Time Series Data in Python


Welcome to Earth Analytics Python - Week 3!

Welcome to week 3 of Earth Analytics! This week you will learn how to work with and plot time series data using Python and Jupyter Notebooks. You will learn how to:

  • Import tab-delimited and comma-separated text files into pandas.
  • Handle different date and time fields and formats in pandas.
  • Set a datetime field as an index when importing your data.
  • Calculate the return time for a flood event.
  • Subset time series data by date.
  • Handle missing data values in pandas.
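As a preview of several of these goals, here is a minimal sketch of importing a small comma-separated table into pandas with a datetime index, flagging missing values, and subsetting by date. The column names, the values, and the -999 "no data" flag are all hypothetical and are not taken from the course files.

```python
import io

import pandas as pd

# Hypothetical comma-separated precipitation records. The -999 value is an
# assumed "no data" flag, not one taken from the course data.
csv_text = """DATE,PRECIP
2013-09-10,1.0
2013-09-11,-999
2013-09-12,9.8
2013-10-01,0.1
"""

precip = pd.read_csv(
    io.StringIO(csv_text),   # a file path works here too
    parse_dates=["DATE"],    # convert the DATE column to datetime values
    index_col="DATE",        # use the datetime column as the index
    na_values=[-999],        # treat the flag value as missing (NaN)
)
# For a tab-delimited file, the same call also takes sep="\t".

# Subset by date: select every row from September 2013
september = precip.loc["2013-09"]

# Handle missing values: drop the rows that contain NaN
september_clean = september.dropna()

print(september_clean)
```

Setting the index while importing is what makes the date-string selection with .loc possible. Whether to drop or fill missing values depends on the analysis, so dropna() is only one option.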

What You Need

To complete this lesson, you will need the Colorado Flood Teaching Data Subset and a computer with Anaconda Python 3.x and the earth-analytics-python environment installed.

Download Colorado Flood Teaching Data Subset

The data that you use this week were collected by sensor networks managed by U.S. agencies and include:

  • USGS stream gage network data and
  • NOAA / National Weather Service precipitation data.

All of the data you work with were collected in Boulder, Colorado around the time of the 2013 floods.

Read the assignment below carefully. Use the class and homework lessons to help you complete the assignment.

Class Schedule

Time | Topic | Speaker
 | Review Jupyter Notebooks / Raster data in Python / questions | Leah
 | Python coding session - Time Series Data in Pandas - Python | Leah
 | Break |
 | Speaker - Matt Rossi - Understanding Floods | Matt Rossi
 | Return Time Activity | Leah & Matt

Week 3

Important - Data Organization

After you have downloaded the data for this week, be sure that your directory is set up as specified below.

If you are working locally on your computer, you will need to unzip the zip file. When you do this, be sure that your directory looks like the image below. Note that all of the data are within the colorado-flood directory; they are not nested within another directory. You may have to copy and paste your files into the correct directory to get this layout.

Your `colorado-flood` file directory should look like the one above. Note that the data are directly under the colorado-flood folder.

If you are working in the Jupyter Hub or have the earth-analytics-python environment installed on your computer, you can use the earthpy download function to access the data, like this:

import earthpy as et
et.data.get_data("colorado-flood")

Why Data Organization Matters

It is important that your data are organized as specified in the lessons because:

  1. When the instructors grade your assignments, we will be able to run your code if your directory looks like the instructors’.
  2. It will be easier for you to follow along in class if your directory is the same as the instructors’.
  3. Your notebook becomes more reproducible if you use a standard working directory. Most computing environments have a default home directory. It is good practice to learn how to organize your files in a way that makes it easier for your future self to find and work with your data!
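One way to check the layout described above is a short script. The sketch below assumes the working directory is named earth-analytics, sits in your home directory, and keeps course data in a data subdirectory; adjust the path if your setup differs. The helper nested_copy_exists is a hypothetical name introduced only for this example.

```python
import os

# Assumed location: a working directory named "earth-analytics" in the home
# directory, with course data stored under "data". Adjust as needed.
home = os.path.expanduser("~")
data_dir = os.path.join(home, "earth-analytics", "data")
flood_dir = os.path.join(data_dir, "colorado-flood")


def nested_copy_exists(directory):
    """Return True if `directory` contains a subdirectory with its own name."""
    name = os.path.basename(os.path.normpath(directory))
    return os.path.isdir(os.path.join(directory, name))


if not os.path.isdir(flood_dir):
    print("Not found:", flood_dir)
elif nested_copy_exists(flood_dir):
    # e.g. colorado-flood/colorado-flood/..., which can happen after unzipping
    print("Data are nested one level too deep in:", flood_dir)
else:
    print("Directory looks as expected:", flood_dir)
```

Building paths with os.path.join and the home directory, rather than hard-coding a full path, helps the same notebook run on a different computer.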

Homework Plots

Please visit CANVAS for the assignment and grading rubric. Below are examples of what your plots should look like. Note that you can modify the colors, style, etc. of your plots as you’d like. These plots are just examples to help you visually check your homework.

Homework plot of Monthly max discharge data.
Homework plot of Daily max discharge data.
Homework plot of Monthly total precipitation data.
Homework plot of Daily total discharge data.

Note: to plot the y axis on a log scale, use the argument logy=True in your pandas .plot() call. If you use matplotlib to plot the data, then you will want to calculate the log value in a new column and plot that.
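Both options from the note can be sketched as follows. The discharge values and column names are hypothetical, and the choice of a base-10 log for the new column is an assumption, since the note does not say which base to use.

```python
import matplotlib

# Non-interactive backend so the sketch runs as a plain script.
# Omit this line in a Jupyter Notebook so that plots display inline.
matplotlib.use("Agg")

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical daily discharge values, not taken from the course data
discharge = pd.DataFrame(
    {"discharge": [40.0, 55.0, 4000.0, 900.0, 120.0]},
    index=pd.date_range("2013-09-10", periods=5, freq="D"),
)

# Option 1: let pandas draw the y axis on a log scale
ax = discharge.plot(y="discharge", logy=True, title="Discharge (log y axis)")

# Option 2: compute the log in a new column, then plot it with matplotlib
# (base 10 is an assumption made for this example)
discharge["log_discharge"] = np.log10(discharge["discharge"])
fig, ax2 = plt.subplots()
ax2.plot(discharge.index, discharge["log_discharge"])
ax2.set(xlabel="Date", ylabel="log10(discharge)", title="Log of discharge")
```

With option 1 the axis tick labels stay in the original units; with option 2 the axis shows the log values themselves.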

Probability of stream discharge events plot.
Return period for stream discharge events plot.
Probability for precipitation events plot.
Return period for precipitation events plot.
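The probability and return period values behind plots like these can be estimated in several ways. The sketch below uses one common approach, the Weibull plotting position, in which the exceedance probability is (n - rank + 1) / (n + 1) and the return period is 1 / probability. The annual maximum values and column names are hypothetical, and the class materials may use a different formula.

```python
import pandas as pd

# Hypothetical annual maximum discharge values, indexed by year
annual_max = pd.DataFrame(
    {"discharge": [300.0, 120.0, 4500.0, 800.0]},
    index=[2010, 2011, 2012, 2013],
)

# Sort from smallest to largest and assign ranks 1..n
ranked = annual_max.sort_values("discharge").copy()
n = len(ranked)
ranked["rank"] = range(1, n + 1)

# Exceedance probability: the chance that a value this large or larger
# occurs in any one year (Weibull plotting position, an assumed choice)
ranked["probability"] = (n - ranked["rank"] + 1) / (n + 1)

# Return period in years is the inverse of that probability
ranked["return_years"] = 1 / ranked["probability"]

print(ranked)
```

In this made-up record of four years, the largest event gets a probability of 1/5 and therefore a return period of 5 years. Longer records give more stable estimates.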