# Lesson 8. How to Join Attributes From One Shapefile to Another in Open Source Python Using Geopandas: GIS in Python

## Learning Objectives

After completing this tutorial, you will be able to:

• Join spatial attributes from one shapefile to another using geopandas in python.
• Calculate line segment length using geopandas

## What You Need

You will need a computer with internet access to complete this lesson and the spatial-vector-lidar data subset created for the course.

or using the earthpy package:

et.data.get_data("spatial-vector-lidar")

# Import libraries
import os
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.lines as mlines
from matplotlib.colors import ListedColormap
import pandas as pd
import geopandas as gpd
import earthpy as et
from shapely.geometry import box

# Plot figures inline
plt.ion()

# Set working directory
os.chdir(os.path.join(et.io.HOME, 'earth-analytics'))

# Import clip_data.py as a module so you can access the clip data functions
import clip_data as cl

# Import data



Next dissolve the state data by region like you did in the previous lesson.

# Simplify the country boundary just a little bit to make this run faster
country_bound_us_simp = country_bound_us.simplify(.2, preserve_topology=True)

# Clip the roads to the US boundary - this will take about a minute to execute

# Dissolve states by region
regions_agg = state_bound_us.dissolve(by="region")


## Spatial Joins in Python

Just like you might do in ArcMap or QGIS you can perform spatial joins in Python too. A spatial join is when you append the attributes of one layer to another based upon its spatial relationship.

So - for example if you have a roads layer for the United States, and you want to apply the “region” attribute to every road that is spatially in a particular region, you would use a spatial join. To apply a join you can use the geopandas.sjoin() function as following:

.sjoin(layer-to-add-region-to, region-polygon-layer)

### Sjoin Arguments:

The op argument specifies the type of join that will be applied

• intersects: Returns True if the boundary and interior of the object intersect in any way with those of the other.
• within: Returns True if the object’s boundary and interior intersect only with the interior of the other (not its boundary or exterior).
• contains: Returns True if the object’s interior contains the boundary and interior of the other object and their boundaries do not touch at all.

How allows the following options: (this is taken directly from the geopandas code on github!

• ‘left’: use keys from left_df; retain only left_df geometry column
• ‘right’: use keys from right_df; retain only right_df geometry column
• ‘inner’: use intersection of keys from both dfs; retain only left_df geometry column
# Roads within region
regions_agg,
how="inner",
op='intersects')

# Notice once you have joins the data - you have attributes
# from the regions_object (i.e. the region) attached to each road feature

featureclaindex_rightALAND

Plot the data.

# Reproject to Albers for plotting
country_albers = country_bound_us.to_crs({'init': 'epsg:5070'})

# Plot the data
fig, ax = plt.subplots(figsize=(12, 8))
country_albers.plot(alpha=1,
facecolor="none",
edgecolor="black",
zorder=10,
ax=ax)
ax=ax,
legend=True)
ax.set_axis_off()
plt.axis('equal')
plt.show()


If you want to customize your legend even further, you can once again use loops to do so.

# First, create a dictionary with the attributes of each legend item
'Northeast': ['grey'],
'Southeast': ['m'],
'Southwest': ['purple'],
'West': ['green']}

# Plot the data
fig, ax = plt.subplots(figsize=(12, 8))
regions_agg.plot(edgecolor="black",
ax=ax)
country_albers.plot(alpha=1,
facecolor="none",
edgecolor="black",
zorder=10,
ax=ax)

label=ctype,
ax=ax)
plt.legend(bbox_to_anchor=(1.0, 1), loc=2)
ax.set_axis_off()
plt.axis('equal')
plt.show()


## Calculate Line Segment Length

# Turn off scientific notation
pd.options.display.float_format = '{:.4f}'.format

# Calculate the total length of road

# Sum existing columns