Lesson 4. Understand EPSG, WKT and Other CRS Definition Styles


This lesson discusses the ways that coordinate reference system (CRS) information is stored, including proj4 strings, Well-known Text (WKT) and EPSG codes.

Learning Objectives

After completing this tutorial, you will be able to:

  • Identify the proj4 vs EPSG vs WKT crs format when presented with all three formats
  • Look up a CRS definition in proj4, EPSG or WKT formats using spatialreference.org

What You Need

You will need a computer with internet access to complete this lesson and the spatial-vector-lidar data subset created for the course.

Download Spatial Lidar Teaching Data Subset data

or using the earthpy package:

et.data.get_data("spatial-vector-lidar")

In the previous lessons we learned what a coordinate reference system (CRS) is, the components of a coordinate reference system and the general differences between projected and geographic coordinate reference systems. In this lesson we will cover the different ways that CRS information is stored.

Coordinate Reference System Formats

There are numerous formats that are used to document a CRS. Three common formats include:

  • proj.4
  • EPSG
  • Well-known Text (WKT) formats.

Often you have CRS information in one format and you need to translate that CRS into a different format to use in a tool like Python. Thus it is good to be familiar with some of the key formats that you are likely to encounter.

One of the most powerful websites to look up CRS strings is spatialreference.org. You can use the search on the site to find an EPSG code. Once you find the page associated with your CRS of interest, you can look at all of the various formats associated with that CRS, for example EPSG 4326 - WGS84 geographic.

PROJ or PROJ.4 strings

PROJ.4 strings are a compact way to identify a spatial or coordinate reference system. PROJ.4 strings are one of the formats that Geopandas can accept. However, note that many libraries are moving towards the more concise EPSG format.

Using the PROJ.4 syntax, you specify the complete set of parameters that define a particular CRS, including the ellipsoid, datum, units and projection definition.

Break down the proj.4 format

Below is an example of a proj.4 string:

+proj=utm +zone=11 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0

Notice that the CRS information is structured as a string of characters and numbers that are combined using + signs. This is the proj4 format. The string contains all of the individual CRS elements that Python or another GIS might need. Each element begins with a + sign, similar to how the values in a .csv file are delimited by a comma (,). After each + you see the CRS element being defined, for example +proj= and +datum=.

You can break down the proj4 string into its individual components (again, separated by + signs) as follows:

  • +proj=utm: the projection is UTM. UTM has several zones.
  • +zone=11: the zone is 11, which is a zone on the west coast, USA.
  • +datum=WGS84: the datum is WGS84 (the datum refers to the 0,0 reference for the coordinate system used in the projection)
  • +units=m: the units for the coordinates are in METERS.
  • +ellps=WGS84: the ellipsoid (how the earth’s roundness is calculated) for the data is WGS84

Note that the zone is unique to the UTM projection. Not all CRSs will have a zone.

Also note that while California is above the equator, in the northern hemisphere, there is no N (specifying north) following the zone (i.e. 11N). The southern hemisphere is explicitly specified in the UTM proj4 specification with the +south flag; if +south is absent, you can assume the zone is in the northern hemisphere.
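The breakdown above can be sketched in a few lines of plain Python. This is a minimal illustration only: the helper name parse_proj4 is hypothetical (it is not part of GeoPandas or PROJ), and it assumes that elements are separated by spaces and that each element starts with a + sign.

```python
def parse_proj4(proj_string):
    """Split a proj4 string into a dict of parameter -> value.

    Hypothetical helper for illustration. Flags that carry no value
    (such as +no_defs) are stored as True.
    """
    params = {}
    for token in proj_string.split():
        # Each element starts with "+"; drop it, then split on the first "="
        key, sep, value = token.lstrip("+").partition("=")
        params[key] = value if sep else True
    return params


utm_11n = ("+proj=utm +zone=11 +datum=WGS84 +units=m "
           "+no_defs +ellps=WGS84 +towgs84=0,0,0")
crs_params = parse_proj4(utm_11n)

print(crs_params["proj"])   # utm
print(crs_params["zone"])   # 11
print(crs_params["units"])  # m

# There is no "south" flag, so this is a northern-hemisphere zone
print("south" in crs_params)  # False
```

Note that every value is kept as text, so the zone is the string "11" rather than a number.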

Geographic (lat / long) Proj.4 String

Next, look at another CRS definition.

+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0

This is a lat/long or geographic coordinate reference system, not a projected one. The components of the proj4 string are broken down below.

  • +proj=longlat: the data are in a geographic (latitude and longitude) coordinate system
  • +datum=WGS84: the datum is WGS84 (the datum refers to the 0,0 reference for the coordinate system used in the projection)
  • +ellps=WGS84: the ellipsoid (how the earth’s roundness is calculated) is WGS84

Note that there are no specified units above. This is because this geographic coordinate reference system is in latitude and longitude which is most often recorded in Decimal Degrees.
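To see the difference between the two strings directly, you can compare the parameter names that each one uses. The sketch below is plain Python; proj4_keys is a hypothetical helper written for this comparison and assumes space-separated elements that start with +.

```python
def proj4_keys(proj_string):
    """Return the set of parameter names used in a proj4 string."""
    return {token.lstrip("+").split("=")[0] for token in proj_string.split()}


utm_string = ("+proj=utm +zone=11 +datum=WGS84 +units=m "
              "+no_defs +ellps=WGS84 +towgs84=0,0,0")
geographic_string = ("+proj=longlat +datum=WGS84 +no_defs "
                     "+ellps=WGS84 +towgs84=0,0,0")

# Parameters that appear only in the projected (UTM) definition
only_in_utm = proj4_keys(utm_string) - proj4_keys(geographic_string)
print(sorted(only_in_utm))  # ['units', 'zone']
```

The geographic string carries neither a zone nor a units element, which matches the note above.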

Data Tip: the last portion of each proj4 string is +towgs84=0,0,0. This is a conversion factor that is used if a datum conversion is required.

EPSG codes

EPSG codes are four- to five-digit numbers, each of which represents a particular CRS definition. The acronym EPSG comes from the now-defunct European Petroleum Survey Group.

Explore EPSG codes on spatialreference.org.

Import the worldBoundary layer that you’ve been working with in this module to explore the CRS.

from glob import glob
import numpy as np
import os
import matplotlib.pyplot as plt
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point
import earthpy as et

# Turn on matplotlib's interactive mode so plots render as they are created
plt.ion()

os.chdir(os.path.join(et.io.HOME, 'earth-analytics'))
# Import world boundary shapefile
worldBound = gpd.read_file(
    "data/spatial-vector-lidar/global/ne_110m_land/ne_110m_land.shp")
worldBound.crs
{'init': 'epsg:4326'}

Notice that the CRS returned above consists of two parts:

  1. ‘init’, which tells Python that a CRS definition (i.e. an EPSG code) will be provided, and
  2. the EPSG code itself: epsg:4326
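If you need the numeric code on its own, the two parts can be pulled apart with ordinary string handling. This is a minimal sketch: epsg_from_init is a hypothetical helper, not a GeoPandas function, and it assumes the CRS is stored as a dictionary in the form shown above.

```python
def epsg_from_init(crs_dict):
    """Return the integer EPSG code from a {'init': 'epsg:XXXX'} dictionary.

    Hypothetical helper for illustration. Returns None when the
    dictionary does not hold an EPSG-style 'init' entry.
    """
    init_value = crs_dict.get("init", "")
    # Split "epsg:4326" into the authority ("epsg") and the code ("4326")
    authority, _, code = init_value.partition(":")
    if authority.lower() == "epsg" and code.strip().isdigit():
        return int(code)
    return None


world_crs = {'init': 'epsg:4326'}
print(epsg_from_init(world_crs))  # 4326
```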

How to Create a CRS Object in Python

You often need to define the CRS for a spatial object. For example in the previous lessons, you created new spatial point layers, and had to define the CRS that the point x,y locations were in.

To do this you completed the following steps:

  1. You manually created an array for a single point (x,y).
  2. You turned that x,y point into a shapely Point object.
  3. Finally, you converted that point object to a geopandas GeoDataFrame.

# Create a numpy array with x,y location of Boulder
boulder_xy = np.array([[476911.31, 4429455.35]])

# Create shapely point object
boulder_xy_pt = [Point(xy) for xy in boulder_xy]

# Convert to spatial dataframe - geodataframe -- assign the CRS using epsg code
boulder_loc = gpd.GeoDataFrame(boulder_xy_pt,
                               columns=['geometry'],
                               crs={'init': 'epsg:2957'})

# View crs of new spatial points object
boulder_loc.crs
{'init': 'epsg:2957'}

WKT or Well-known Text

Well-known Text (WKT) is a compact machine- and human-readable text format for representing geometric objects and coordinate reference systems. It defines the elements of a CRS definition using a combination of brackets [] and elements separated by commas (,). It’s useful to recognize this format because many tools, including ESRI’s ArcMap and ENVI, use it.

Here is an example of WKT for WGS84 geographic:

GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]]

Notice here that the elements are described explicitly using all caps - for example:

  • UNIT
  • DATUM
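One quick way to see every element in a WKT string is to list the capitalized keywords that sit directly in front of an opening bracket. The sketch below uses Python's built-in re module on the example string above; the pattern is an assumption that fits this example and is not a full WKT parser.

```python
import re

wkt = ('GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",'
       'SPHEROID["WGS_1984",6378137,298.257223563]],'
       'PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]]')

# A keyword is a run of capital letters immediately followed by "["
keywords = re.findall(r"([A-Z]+)\[", wkt)
print(keywords)
# ['GEOGCS', 'DATUM', 'SPHEROID', 'PRIMEM', 'UNIT']
```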

Sometimes WKT-structured CRS information is embedded in a metadata file, similar to the structure seen below:


GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0,
        AUTHORITY["EPSG","8901"]],
    UNIT["degree",0.01745329251994328,
        AUTHORITY["EPSG","9122"]],
    AUTHORITY["EPSG","4326"]]
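The AUTHORITY entries in this example tie each element back to an EPSG code. As a rough sketch, you can collect them with a regular expression. It assumes that the outermost AUTHORITY entry, which describes the CRS as a whole, is written last, as it is in this example. Other WKT layouts may differ.

```python
import re

wkt_with_codes = """GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0,
        AUTHORITY["EPSG","8901"]],
    UNIT["degree",0.01745329251994328,
        AUTHORITY["EPSG","9122"]],
    AUTHORITY["EPSG","4326"]]"""

# Collect every (authority name, code) pair in the order it appears
authorities = re.findall(r'AUTHORITY\["(\w+)","(\d+)"\]', wkt_with_codes)

# In this layout the last pair belongs to the CRS itself
print(authorities[-1])  # ('EPSG', '4326')
```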

How to Look Up a CRS

One of the most useful websites for looking up CRS information is spatialreference.org. The site has a search function that allows you to search for strings such as:

  • UTM 11N or
  • WGS84

Once you find the CRS that you are looking for, you can explore definitions of the CRS using various formats including proj4, epsg, WKT and others.
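When you copy a definition from a site like this, it helps to recognize which format you are holding. The function below is a rough heuristic sketch in plain Python. guess_crs_format is a hypothetical name, and the rules (an epsg: prefix, a leading + sign, or a capitalized keyword followed by a bracket) are simplifying assumptions based on the examples in this lesson.

```python
import re


def guess_crs_format(crs_text):
    """Make a rough guess at which CRS format a string uses."""
    text = crs_text.strip()
    # EPSG codes: "epsg:" followed by a four- or five-digit number
    if re.fullmatch(r"epsg:\s*\d{4,5}", text, flags=re.IGNORECASE):
        return "EPSG"
    # proj4 strings: every element starts with a "+" sign
    if text.startswith("+"):
        return "proj4"
    # WKT: a capitalized keyword followed by an opening bracket
    if re.match(r"[A-Z_]+\[", text):
        return "WKT"
    return "unknown"


print(guess_crs_format("epsg:4326"))                            # EPSG
print(guess_crs_format("+proj=longlat +datum=WGS84 +no_defs"))  # proj4
print(guess_crs_format('GEOGCS["WGS 84",DATUM["WGS_1984"]]'))   # WKT
```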
