# Lesson 7. The Relationship Between Precipitation and Stream Discharge: Explore Mass Balance

## Introduction to Flood Frequency Analysis

import hydrofunctions as hf
import urllib
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
import math
plt.ion()
# set standard plot parameters for uniform plotting
plt.rcParams['figure.figsize'] = (11, 6)
# prettier plotting with seaborn
import seaborn as sns
sns.set(font_scale=1.5)
sns.set_style("whitegrid")
# set working dir and import earthpy
import earthpy as et
os.chdir(os.path.join(et.io.HOME, 'earth-analytics'))


Picking up from the previous lesson…

# define the site number and start and end dates that you are interested in
site = "06730500"
start = '1946-05-10'
end = '2018-08-29'

# then request data for that site and time period
longmont_resp = hf.get_nwis(site, 'dv', start, end)
# get the data in a pandas dataframe format
longmont_discharge = hf.extract_nwis_df(longmont_resp)
# rename columns
longmont_discharge.columns = ["discharge", "flag"]
# add a year column to the data
longmont_discharge["year"] = longmont_discharge.index.year

# calculate annual max by resampling to year-start frequency
longmont_discharge_annual_max = longmont_discharge.resample('YS').max()
longmont_discharge_annual_max.head()

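| datetime   | discharge | flag | year   |
|------------|-----------|------|--------|
| 1946-01-01 | 99.0      | A    | 1946.0 |
| 1947-01-01 | 1930.0    | A    | 1947.0 |
| 1948-01-01 | 339.0     | A    | 1948.0 |
| 1949-01-01 | 2010.0    | A    | 1949.0 |
| 1950-01-01 | NaN       | NaN  | NaN    |
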
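# download USGS annual peak flow data from the NWIS site
url = "https://nwis.waterdata.usgs.gov/nwis/peak?site_no=06730500&agency_cd=USGS&format=rdb"
# open the data using pandas
# assumption: header=[1, 2] reads the two header rows of the RDB file
# (column names and field formats), which is why one level is dropped below
usgs_annual_max = pd.read_csv(url,
                              skiprows=63,
                              header=[1, 2],
                              sep='\t',
                              parse_dates=[2])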
# drop one level of index
usgs_annual_max.columns = usgs_annual_max.columns.droplevel(1)
# finally set the date column as the index
usgs_annual_max = usgs_annual_max.set_index(['peak_dt'])

# optional - remove columns we don't need - this is just to make the lesson easier to read
# you can skip this step if you want
usgs_annual_max = usgs_annual_max.drop(["gage_ht_cd", "year_last_pk","ag_dt", "ag_gage_ht", "ag_tm", "ag_gage_ht_cd"], axis=1)

# add a year column to the data for easier plotting
usgs_annual_max["year"] = usgs_annual_max.index.year
# remove duplicate years - keep the max discharge value
usgs_annual_max = usgs_annual_max.sort_values('peak_va', ascending=False).drop_duplicates('year').sort_index()
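# view cleaned dataframe
usgs_annual_max.head()

| peak_dt    | agency_cd | site_no | peak_tm | peak_va | peak_cd | gage_ht | year |
|------------|-----------|---------|---------|---------|---------|---------|------|
| 1927-07-29 | USGS      | 6730500 | 06:00   | 407.0   | 5       | 3.00    | 1927 |
| 1928-06-04 | USGS      | 6730500 | 09:00   | 694.0   | 5       | 3.84    | 1928 |
| 1929-07-23 | USGS      | 6730500 | 15:00   | 530.0   | 5       | 3.40    | 1929 |
| 1930-08-18 | USGS      | 6730500 | 05:00   | 353.0   | 5       | 2.94    | 1930 |
| 1931-05-29 | USGS      | 6730500 | 09:00   | 369.0   | 5       | 2.88    | 1931 |
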
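The sort_values / drop_duplicates chain used above is easier to follow on a tiny table. The sketch below uses made-up peak values (not USGS data) with two records in the same year, and shows that only the larger one is kept. The names example_peaks and example_dedup are hypothetical.

```python
import pandas as pd

# made-up peak records, for illustration only; 1951 appears twice
example_peaks = pd.DataFrame(
    {"peak_va": [300.0, 450.0, 200.0], "year": [1951, 1951, 1952]},
    index=pd.to_datetime(["1951-05-01", "1951-08-15", "1952-06-10"]),
)

# sort so the largest peak comes first, keep the first row seen for each year,
# then restore chronological order
example_dedup = (
    example_peaks.sort_values("peak_va", ascending=False)
    .drop_duplicates("year")
    .sort_index()
)
print(example_dedup)
```

Because drop_duplicates keeps the first occurrence by default, sorting in descending order beforehand is what makes the largest value per year the one that is kept.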
# plot calculated vs USGS annual max flow values
fig, ax = plt.subplots(figsize=(11, 9))
ax.plot(usgs_annual_max["year"],
        usgs_annual_max["peak_va"],
        color="purple",
        linestyle=':',
        marker='o',
        label="USGS Annual Max")
ax.plot(longmont_discharge_annual_max["year"],
        longmont_discharge_annual_max["discharge"],
        color="lightgrey",
        linestyle=':',
        marker='o', label="Calculated Annual Max")
ax.legend()
ax.set_title("Annual Maxima - USGS Peak Flow vs Daily Calculated");


## Calculate Cumulative Sum

Next you will create a plot that shows both stream discharge and the total cumulative runoff that it represents over the time period of interest. This plot is useful because you can compare it to the plot of precipitation that you create for your homework.

Together, stream runoff and precipitation can be explored to better understand the mass balance of water in your watershed of interest. The total precipitation in the watershed minus the total runoff gives an estimate of how much water is being “lost” in the system to evapotranspiration. The steps are as follows:

1. Calculate the cumulative sum using the .cumsum() method in pandas.
2. Convert CFS (cubic feet per second) to a more meaningful unit of runoff by:
   • converting CFS to cubic feet per day
   • dividing this value by the total area of the watershed to get a volume of water per unit area
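
As a rough illustration of the mass balance described above, the sketch below subtracts runoff from precipitation to estimate the water "lost" to evapotranspiration. The function name and the numbers are hypothetical, chosen only to show the arithmetic. Both inputs are assumed to be depths in millimeters over the same watershed and time period, and changes in storage (groundwater, snowpack, reservoirs) are ignored.

```python
def estimate_evapotranspiration_mm(precip_mm, runoff_mm):
    """Estimate water 'lost' to evapotranspiration as precipitation minus runoff.

    Assumes both inputs are depths (mm) over the same watershed and period,
    and that changes in storage are negligible.
    """
    return precip_mm - runoff_mm


# hypothetical annual totals, for illustration only (not measured values)
example_precip_mm = 500.0
example_runoff_mm = 120.0

example_et_mm = estimate_evapotranspiration_mm(example_precip_mm, example_runoff_mm)
print("Estimated evapotranspiration (mm):", example_et_mm)
```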

The USGS site page lists the drainage area for this site: 447 square miles.

# convert site drainage area to square km
miles_km = 2.58999
site_drainage = 447
longmont_area = site_drainage * miles_km
print("The site drainage area in square km =", longmont_area)

The site drainage area in square km = 1157.72553


Next you calculate the cumulative sum, convert that to cubic feet per day and then divide by the drainage area.

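# seconds per day: converts cubic feet per second to cubic feet per day
convert_to_cub_feet_day = (60*60*24)

# divide by the drainage area to get cubic feet per day per square km
convert_to_runoff = convert_to_cub_feet_day / longmont_area
print("Runoff conversion factor =", round(convert_to_runoff, 3))

Runoff conversion factor = 74.629

# cumulative runoff per unit drainage area (cubic feet per square km)
longmont_discharge["cum-sum-vol"] = longmont_discharge['discharge'].cumsum()*convert_to_runoff
longmont_discharge.head()

| datetime   | discharge | flag | year | cum-sum-vol |
|------------|-----------|------|------|-------------|
| 1946-05-10 | 16.0      | A    | 1946 | 1194.065402 |
| 1946-05-11 | 19.0      | A    | 1946 | 2612.018066 |
| 1946-05-12 | 9.0       | A    | 1946 | 3283.679855 |
| 1946-05-13 | 3.0       | A    | 1946 | 3507.567117 |
| 1946-05-14 | 7.8       | A    | 1946 | 4089.674001 |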
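
Dividing cubic feet by square kilometers leaves the result in mixed units. One way to make it directly comparable with precipitation is to express runoff as a depth, by converting the drainage area to square feet first. The sketch below does this for a few made-up discharge values. The helper name cumulative_runoff_depth_mm and the sample data are assumptions for illustration, not part of the lesson's workflow.

```python
import pandas as pd

SECONDS_PER_DAY = 60 * 60 * 24
FEET_PER_MILE = 5280
MM_PER_FOOT = 304.8


def cumulative_runoff_depth_mm(daily_mean_cfs, drainage_area_sq_miles):
    """Convert daily mean discharge (CFS) to cumulative runoff depth (mm)."""
    # cubic feet of water that pass the gage each day
    daily_volume_cubic_feet = daily_mean_cfs * SECONDS_PER_DAY
    # drainage area in square feet, so that volume / area is a depth in feet
    area_sq_feet = drainage_area_sq_miles * FEET_PER_MILE ** 2
    daily_depth_mm = daily_volume_cubic_feet / area_sq_feet * MM_PER_FOOT
    return daily_depth_mm.cumsum()


# made-up daily mean discharge values (CFS), for illustration only
example_discharge = pd.Series(
    [100.0, 200.0, 300.0],
    index=pd.date_range("2000-01-01", periods=3, freq="D"),
)

# 447 square miles is the drainage area listed on the USGS site page
example_depth = cumulative_runoff_depth_mm(example_discharge, 447)
print(example_depth)
```

A depth in millimeters can be subtracted from a precipitation total in millimeters, which is what the mass balance comparison needs.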

## Plot Cumulative Sum of Runoff and Daily Mean Discharge Together

Finally, you can plot the cumulative sum on top of your discharge values. This plot is an interesting way to view increases and decreases in discharge as they occur over time.

### Creating this Plot

Notice that the plot below shows two sets of data with different y-axes on the same plot. The key to making this work is:

ax2 = ax.twinx()

This defines a second y-axis, ax2, that shares the x-axis of the existing ax object, so both datasets are drawn in the same figure.

# plot your data
fig, ax = plt.subplots(figsize=(11,7))
longmont_discharge["cum-sum-vol"].plot(ax=ax, label = "Cumulative Volume")

# Make the y-axis label, ticks and tick labels match the line color.
ax.set_ylabel('Total Area Runoff', color='b')
ax.tick_params('y', colors='b')

ax2 = ax.twinx()
ax2.scatter(x=longmont_discharge.index,
            y=longmont_discharge["discharge"],
            marker="o",
            s=4,
            color="purple", label="Daily Mean")
ax2.set_ylabel('Stream Discharge (CFS)', color='purple')
ax2.tick_params('y', colors='purple')
ax2.set_ylim(0,10000)
ax.set_title("Cumulative Sum & Daily Mean Discharge")
ax.legend()

# reposition the second legend so it renders under the first legend item
ax2.legend(loc = "upper left", bbox_to_anchor=(0.0, 0.9))
fig.tight_layout()
plt.show()