Lesson 4. File Formats Exercise
# Import the packages needed to complete this lesson
import os

import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
import earthpy as et
# Create a home directory
home_dir = os.path.join(et.io.HOME, 'earth-analytics', 'data', 'earthpy-downloads')
if not os.path.isdir(home_dir):
    os.makedirs(home_dir)

# Set your working directory
os.chdir(os.path.join(et.io.HOME, 'earth-analytics', 'data', 'earthpy-downloads'))
Challenge 1: Open a Text File
Use the code below to download a .csv file containing data for the climbing formations in Boulder, Colorado:
Once you have downloaded the data:
- Read the data into Python as a pandas DataFrame. IMPORTANT: Name your DataFrame object boulder_climbing.
- View the pandas DataFrame. Look at the columns in the data. Find the FormationType column. Notice that it is a categorical column that splits the data into two types of formations.
# Download the data that you will use in this lesson
et.data.get_data(
    url="https://opendata.arcgis.com/datasets/175425c25d8849b58feb89483ef02961_1.csv")
Downloading from https://opendata.arcgis.com/datasets/175425c25d8849b58feb89483ef02961_1.csv
# This code will clean up your file name
# This is a temporary fix for a bug in our earthpy package!
old_name_climb = '"OSMP_Climbing_Formations.csv"'
new_name_climb = 'OSMP_Climbing_Formations.csv'
if not os.path.exists(new_name_climb):
    os.rename(old_name_climb, new_name_climb)
IMPORTANT. When you download the data, you may notice that there are quotes around the file name like this:
"OSMP_Climbing_Formations.csv". You will need to call
|0||-105.294224||40.005020||1||1.0||Pumpkin Rock||4.0||No||OSMP||N||Flagstaff||First Areas||N||No||Boulder||Yes|
|2||-105.293598||39.995411||3||3.0||Third Pinnacle||7.0||No||OSMP||Y||Gregory Canyon||NaN||N||No||Wall||Yes|
|4||-105.292811||39.995952||5||6.0||First Pinnacle||23.0||No||OSMP||Y||Gregory Canyon||NaN||N||No||Wall||Yes|
|449||-105.288583||39.978690||1178||NaN||Ridge One (Like Heaven)||NaN||No||OSMP||N||NCAR||NaN||N||No||Boulder||Yes|
|450||-105.289838||39.978304||1179||NaN||Ridge Two (Satans Slab)||NaN||No||OSMP||Y||NCAR||NaN||N||No||Boulder||Yes|
453 rows × 15 columns
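The read-and-inspect steps in this challenge can be sketched as follows. This is a minimal example, not the lesson's solution: it uses a three-row in-memory stand-in so that it runs without the download. The X, Y, and FormationType column names come from this lesson, and the values appear to match rows in the output above (the header row is not shown there, so that mapping is an assumption). The trimmed column set and the name boulder_climbing_sample are hypothetical.

```python
import io

import pandas as pd

# Hypothetical three-row stand-in for OSMP_Climbing_Formations.csv.
# The real file has more columns; only X, Y, and FormationType are used here.
sample_csv = io.StringIO(
    "X,Y,FormationType\n"
    "-105.294224,40.005020,Boulder\n"
    "-105.293598,39.995411,Wall\n"
    "-105.292811,39.995952,Wall\n"
)

# Read the text data into a pandas DataFrame
boulder_climbing_sample = pd.read_csv(sample_csv)

# List the column names, then count how many rows fall in each category
print(boulder_climbing_sample.columns.tolist())
formation_counts = boulder_climbing_sample["FormationType"].value_counts()
print(formation_counts)
```

With the real file, you would pass the cleaned file name to pd.read_csv() instead of the in-memory sample.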
How to Convert x,y Coordinate Data To A GeoDataFrame (or shapefile) - Spatial Data in Tabular Formats
Often you will find that tabular data, stored in a text or spreadsheet format, contains spatial coordinate information that you wish to plot or convert to a shapefile for use in a GIS application. In the challenge below, you will learn how to convert tabular data containing coordinate information into a spatial file.
Challenge 2: Create a Spatial GeoDataFrame From a DataFrame
You can create a GeoPandas GeoDataFrame from a pandas DataFrame if there is coordinate data in the DataFrame. In the data that you opened above, there are columns for the X and Y coordinates of each rock formation, with headers named X and Y.
You can convert columns containing x,y coordinate data using the GeoPandas points_from_xy() function as follows:
coordinates = gpd.points_from_xy(column-with-x-data, column-with-y-data)
You can then set the geometry column for the new GeoDataFrame to the x,y data that you extracted from the DataFrame. Copy the code below to create a new GeoDataFrame containing the Boulder climbing area data in a spatial format that you can plot.
IMPORTANT: be sure to assign the output of the code below to a new variable name called boulder_climbing_gdf, the name that the later code in this lesson expects.
coordinates = gpd.points_from_xy(boulder_climbing.X, boulder_climbing.Y)
gpd.GeoDataFrame(data=boulder_climbing, geometry=coordinates)
In your code:
- Copy the code above to create a GeoDataFrame from the DataFrame that you created above.
- Next, plot your data using .plot().
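The conversion described in this challenge can be sketched on a small hand-built DataFrame, so that it runs without the downloaded file. The names climbing_sample and climbing_sample_gdf are hypothetical. The crs="EPSG:4326" argument is an assumption that is not part of the lesson's code: the coordinates look like longitude and latitude in decimal degrees, but the lesson does not state the CRS of the CSV.

```python
import geopandas as gpd
import pandas as pd

# Hypothetical stand-in for the boulder_climbing DataFrame
climbing_sample = pd.DataFrame({
    "X": [-105.294224, -105.293598, -105.292811],
    "Y": [40.005020, 39.995411, 39.995952],
    "FormationType": ["Boulder", "Wall", "Wall"],
})

# Build point geometries from the x (longitude) and y (latitude) columns
coordinates = gpd.points_from_xy(climbing_sample.X, climbing_sample.Y)

# Attach the points as the geometry column.
# ASSUMPTION: the coordinates are longitude/latitude in EPSG:4326.
climbing_sample_gdf = gpd.GeoDataFrame(
    data=climbing_sample, geometry=coordinates, crs="EPSG:4326"
)

print(climbing_sample_gdf.geometry.geom_type.unique())
print(climbing_sample_gdf.crs)
```

Leaving out the crs argument, as the lesson's code does, still produces a valid GeoDataFrame, but one with no CRS assigned.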
Data Tip: You can easily export data in a GeoPandas format to a shapefile using
object_name_here.to_file("file-name-here.shp"). Following the example above, if you want to export a shapefile called boulder-climbing.shp, your code would look like this:
Challenge 3: Create a Base Map
Next, you will create a basemap. Run the code below to download another file for Boulder. Notice that this time the data are in GeoJSON format rather than a shapefile. Even though the format is different, you can work with the data in GeoPandas in the same way that you would work with a shapefile using
The data file is:
The code below downloads and cleans up the file name.
# Get the data
et.data.get_data(
    url="https://opendata.arcgis.com/datasets/955e7a0f52474b60a9866950daf10acb_0.geojson")

# This code will clean up your file name
# This is a temporary fix for a bug in our earthpy package!
old_name_city = '"City_Limits.geojson"'
new_name_city = 'City_Limits.geojson'
if not os.path.exists(new_name_city):
    os.rename(old_name_city, new_name_city)
Downloading from https://opendata.arcgis.com/datasets/955e7a0f52474b60a9866950daf10acb_0.geojson
Challenge 4: Plot Two GeoDataFrames Together in the Same Figure
Previously, you learned how to plot multiple shapefiles or spatial layers on the same map using matplotlib.
- Use what you learned in the spatial vector lesson in this chapter to plot the climbing formations points layer on top of the city boundary that you opened above.
- Use the color= and edgecolor= parameters to change the colors of the city object (example: color="white", edgecolor="grey").
- Use legend=True to add a legend to your map.
- Use column='FormationType' to plot your points according to the type of climbing formation (Boulder vs Wall).
HINT: Refer back to the vector lesson if you forget how to create your plot!
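One way the layering could look, sketched with small stand-in layers so that it runs on its own. The rectangle that stands in for the city boundary, the three sample points, and the names city_sample and points_sample are assumptions. categorical=True is passed explicitly as a precaution so that the text column is treated as categories.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt

import geopandas as gpd
from shapely.geometry import box

# Hypothetical stand-ins for the city boundary and the climbing formations
city_sample = gpd.GeoDataFrame(
    geometry=[box(-105.30, 39.97, -105.20, 40.05)], crs="EPSG:4326"
)
points_sample = gpd.GeoDataFrame(
    {"FormationType": ["Boulder", "Wall", "Wall"]},
    geometry=gpd.points_from_xy(
        [-105.294224, -105.293598, -105.292811],
        [40.005020, 39.995411, 39.995952],
    ),
    crs="EPSG:4326",
)

# Draw both layers on the same axes: polygon first, points on top
fig, ax = plt.subplots()
city_sample.plot(ax=ax, color="white", edgecolor="grey")
points_sample.plot(
    ax=ax, column="FormationType", categorical=True, legend=True
)
```

Passing the same ax to both .plot() calls is what places the layers in one figure. The layer plotted last is drawn on top.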
Challenge 5: Customize Your Map
Next, you will customize the map that you created above. Here’s what you need to do to spruce up your map:
- Add a title to your map using
- Set the figsize of the map to be larger so the data are more clearly shown. figsize is one of the arguments of plt.subplots and needs to be set to a tuple of numbers. For example:
- Turn off the x and y axis data ticks to make the plot look more like a map using:
- Customize the colors of the city boundary using the parameters: color="color-name-here" to change the color of the fill of the polygon. Use edgecolor="color-name-here" to change the outline color of the polygon. HINT: you may want to set color="white" for the polygon and make the edgecolor a darker color so you have a clean outline.
- Play around with modifying the markers for the points. The marker is the symbol used to represent the x,y location. The default marker is a circle. Modify the marker= and markersize= parameters in the plot() function for the climbing formations to make the points more legible. Here is a list of marker options in matplotlib: https://matplotlib.org/3.2.1/api/markers_api.html.
Examples of modifying the marker and marker size:
OPTIONAL: See what happens when you use the
HINT: see this documentation to learn more about color maps in python: https://matplotlib.org/3.2.1/tutorials/colors/colormaps.html
Have fun customizing your map!
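The customizations listed above could be combined roughly as follows. This is a sketch on stand-in data, not the lesson's solution. The sample layers, the title text, and the specific values (figsize=(10, 6), marker="*", markersize=60, cmap="Set2") are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt

import geopandas as gpd
from shapely.geometry import box

# Hypothetical stand-ins for the city boundary and the climbing formations
city_sample = gpd.GeoDataFrame(
    geometry=[box(-105.30, 39.97, -105.20, 40.05)], crs="EPSG:4326"
)
points_sample = gpd.GeoDataFrame(
    {"FormationType": ["Boulder", "Wall", "Wall"]},
    geometry=gpd.points_from_xy(
        [-105.294224, -105.293598, -105.292811],
        [40.005020, 39.995411, 39.995952],
    ),
    crs="EPSG:4326",
)

# A larger figure: figsize is a (width, height) tuple in inches
fig, ax = plt.subplots(figsize=(10, 6))

# White fill with a darker outline for the boundary polygon
city_sample.plot(ax=ax, color="white", edgecolor="black")

# Star markers, a larger marker size, and a qualitative color map
points_sample.plot(
    ax=ax,
    column="FormationType",
    categorical=True,
    legend=True,
    marker="*",
    markersize=60,
    cmap="Set2",
)

# Add a title and hide the axis ticks and frame
ax.set_title("Climbing Formations (sample data)")
ax.set_axis_off()
```

ax.set_axis_off() hides the ticks, tick labels, and frame in one call, which makes the plot read more like a map.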
OPTIONAL: Interactive Spatial Maps Using Folium
The maps that you created above are static: you cannot interact with them. You can also make interactive maps with Python in Jupyter Notebooks using the Folium package.
Set your GeoDataFrame name for your climbing formations to the variable specified in the code below,
import folium

# Define coordinates of where we want to center our map
map_center_coords = [40.015, -105.2705]

# Create the map
my_map = folium.Map(location=map_center_coords, zoom_start=13)

# Add one marker per climbing formation (folium expects [latitude, longitude])
for lat, long in zip(climbing_locations.geometry.y, climbing_locations.geometry.x):
    folium.Marker(
        location=[lat, long],
    ).add_to(my_map)

my_map
and run it in your code to see what happens!
More reading on how to use Folium here
# In this cell, uncomment the line below.
# This should set your GeoDataFrame to our
# variable name to make the code with folium run

# climbing_locations = boulder_climbing_gdf
BONUS Challenge: Clip Climbing Formations to the City of Boulder
In the vector notebook, you learned how to clip spatial data. In your code, do the following:
- Clip the climbing formations to the boundary of the city of Boulder.
- Plot the clipped points on top of the city boundary.
If you want, you could create another folium map of the clipped data!
<ipython-input-15-e65987880be3>:5: UserWarning: CRS mismatch between the CRS of left geometries and the CRS of right geometries.
Use `to_crs()` to reproject one of the input geometries to match the CRS of the other.

Left CRS: None
Right CRS: EPSG:4326

  climbing_in_boulder = gpd.clip(boulder_climbing_gdf, city_limits)
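The warning above reports Left CRS: None, which means the climbing formations GeoDataFrame has no CRS assigned. One possible way to address this is to assign the CRS with set_crs() before clipping. This assumes that the X and Y columns hold longitude and latitude in EPSG:4326, which the lesson does not state. It is not the to_crs() call that the warning mentions: to_crs() reprojects data that already has a CRS, while set_crs() labels data that has none. The sample points, the rectangular boundary, and all variable names below are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical rectangular boundary standing in for the city limits
boundary_sample = gpd.GeoDataFrame(
    geometry=[box(-105.30, 39.99, -105.29, 40.01)], crs="EPSG:4326"
)

# Points created without a CRS, as in the lesson's GeoDataFrame code
points_no_crs = gpd.GeoDataFrame(
    {"name": ["Pumpkin Rock", "Ridge One (Like Heaven)"]},
    geometry=gpd.points_from_xy(
        [-105.294224, -105.288583], [40.005020, 39.978690]
    ),
)

# ASSUMPTION: the coordinates are longitude/latitude in EPSG:4326,
# so the CRS is assigned (not reprojected) to match the boundary layer.
points_with_crs = points_no_crs.set_crs("EPSG:4326")

# Keep only the points that fall inside the boundary
clipped_points = gpd.clip(points_with_crs, boundary_sample)
print(clipped_points["name"].tolist())
```

Only the first sample point lies inside the rectangle, so the clipped result keeps one row.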