Introduction to Shapefiles and Vector Data in Open Source Python


Welcome to Week 4!

Welcome to week 4 of Earth Analytics! This week, you will dive deeper into working with spatial data in Python. You will learn how to handle data in different coordinate reference systems, how to create custom maps and legends, and how to extract data from a raster file. You are on your way toward integrating many different types of data into your analysis, which involves knowing how to deal with coordinate reference systems and varying data structures.

What You Need

You will need a computer with internet access to complete this lesson, along with the spatial-vector-lidar data subset created for the course. Note that the data download below is large (172 MB); however, it contains data that you will use for the next 2 weeks!

Download Spatial Lidar Teaching Data Subset

or download it using the earthpy package:

import earthpy as et

et.data.get_data("spatial-vector-lidar")

Time            Topic                                                 Speaker
9:30 AM         Questions / Python                                    Leah
9:45 - 10:15    Coordinate reference systems & spatial metadata 101
10:25 - 12:20   Python coding session - spatial data in Python        Leah

Plot 1 - Roads Map and Legend

Map showing the SJER field site roads and plot locations clipped to the site boundary.

Plot 2 - Roads in Del Norte, Modoc & Siskiyou Counties in California

Map showing the roads layer clipped to the three counties and colored according to which county the road is in.

Plot 3 - Quantile Map for The USA

Total land and total water aggregated by region in the United States.
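
A quantile map assigns each region to a class so that every class holds roughly the same number of regions. The block below is a minimal sketch of that idea using only the Python standard library. The region names and values are hypothetical placeholders, not data from the course, and this is not necessarily how the map above was produced.

```python
import bisect
import statistics
from collections import Counter

# Hypothetical values for eight regions (illustrative only, not real data).
region_values = {
    "region_a": 12.0, "region_b": 3.5, "region_c": 48.0, "region_d": 7.2,
    "region_e": 25.1, "region_f": 90.4, "region_g": 1.8, "region_h": 15.6,
}

# Three cut points split the values into four quantile classes (quartiles).
cut_points = statistics.quantiles(list(region_values.values()), n=4)

# Class 0 holds the smallest values, class 3 the largest.
quantile_class = {
    name: bisect.bisect_right(cut_points, value)
    for name, value in region_values.items()
}

# Count how many regions landed in each class.
class_counts = Counter(quantile_class.values())
print(sorted(class_counts.items()))
```

With eight values and four classes, each class ends up with two regions, which is what distinguishes quantile breaks from equal-width breaks when the values are skewed.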

Plot 4

You can use the code below to download and unzip the data from the Natural Earth website. Please note that the download() function was written to take two arguments, in this order:

  1. a url - this is the URL where the data are located. The URL below might look odd because it contains two “http” strings, but that is how URLs are organized on Natural Earth, and it should work.
  2. a download path - this is the directory where you want to store your data

The download() function will unzip your data for you and place it in the directory that you specify.

# Add these imports to your top cell with the other packages!
import os
from download import download

# Get the data from Natural Earth
url = "https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_countries.zip"

# Directory where the zip file will be unpacked
download_path = os.path.join("data", "spatial-vector-lidar", "global", "ne_10m_admin_0_countries")
download(url, download_path, kind='zip', verbose=False)

# Path to the unzipped countries shapefile
country_path = os.path.join(download_path, "ne_10m_admin_0_countries.shp")
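
A shapefile is not a single file: the .shp file needs at least its .shx and .dbf companions next to it, so it can be worth checking that the unzipped directory is complete before opening the data. The block below is a minimal sketch of such a check. The helper name missing_shapefile_parts and the demo file names are hypothetical, and the demonstration runs in a temporary directory so it does not depend on the download.

```python
import os
import tempfile

# The three mandatory parts of a shapefile share one base name.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(shp_path):
    """Return the required extensions that are absent for shp_path."""
    base, _ = os.path.splitext(shp_path)
    return [ext for ext in REQUIRED_EXTENSIONS
            if not os.path.exists(base + ext)]


# Demonstration in a throwaway directory (hypothetical file names),
# so nothing is downloaded and no real data are touched.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_countries.shp")
    for ext in (".shp", ".dbf"):  # deliberately leave out the .shx file
        open(os.path.join(tmp_dir, "demo_countries" + ext), "w").close()
    demo_missing = missing_shapefile_parts(demo_path)

print(demo_missing)  # ['.shx']
```

With the variables defined above, the equivalent call would be missing_shapefile_parts(country_path), and an empty list would mean all three required files are present.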
Natural Earth Global Mean population rank and total estimated population
