Reproducible Science and Programming - Python

Work with Landsat Remote Sensing Data in Python

Landsat 8 data are downloaded in tif file format. Learn how to open and manipulate Landsat 8 data in Python. Also learn how to create RGB and color infrared Landsat image composites.

last updated: 02 Apr 2020

Customize Dates on Time Series Plots in Python Using Matplotlib

When you plot time series data using the matplotlib package in Python, you often want to customize the date format that is presented on the plot. Learn how to customize the date format on time series plots created using matplotlib.

last updated: 03 Mar 2020

Work With Datetime Format in Python - Time Series Data

Python provides a datetime object for storing and working with dates. Learn how you can convert columns in a pandas dataframe containing dates and times as strings into datetime objects for more efficient analysis and plotting.

last updated: 03 Mar 2020
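
As a rough illustration of the string-to-datetime conversion this lesson describes, the sketch below uses only the standard library datetime module rather than pandas. The date strings and the format string are made-up sample values, not data from the lesson.

```python
from datetime import datetime

# Hypothetical date strings, as they might appear in a text column
date_strings = ["2013-08-21", "2013-09-12", "2013-10-03"]

# Parse each string into a datetime object using an explicit format
dates = [datetime.strptime(s, "%Y-%m-%d") for s in date_strings]

# datetime objects support arithmetic that plain strings do not
elapsed = dates[-1] - dates[0]
print(elapsed.days)
```

In pandas, pd.to_datetime performs the same kind of conversion on a whole column at once.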

Write Functions with Multiple Parameters in Python

A function is a reusable block of code that performs a specific task. Learn how to write functions that can take multiple as well as optional parameters in Python to eliminate repetition and improve efficiency in your code.

last updated: 06 Mar 2020
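
A minimal sketch of a function with two required parameters and one optional parameter might look like the following. The function name convert_units and its parameters are hypothetical, chosen only for illustration.

```python
def convert_units(value, factor, offset=0.0):
    """Return value * factor + offset.

    offset is optional and defaults to 0.0, so callers that only
    need a scale factor can leave it out.
    """
    return value * factor + offset


# Only the required parameters: inches to millimeters
rain_mm = convert_units(2.0, 25.4)

# Required plus optional parameter: Celsius to Fahrenheit
temp_f = convert_units(100.0, 1.8, offset=32.0)
```

Giving offset a default value keeps the common case short while still letting one function cover both conversions.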

Write Functions in Python

A function is a reusable block of code that performs a specific task. Learn how to write functions in Python to eliminate repetition and improve efficiency in your code.

last updated: 06 Mar 2020

Introduction to Writing Functions in Python

A function is a reusable block of code that performs a specific task. Learn how functions can be used to write efficient, DRY (Do Not Repeat Yourself) code in Python.

last updated: 13 Mar 2020

Automate Data Tasks With Loops in Python

Loops can be used to automate data tasks in Python by iteratively executing the same code on multiple data structures. Learn how to automate data tasks in Python using data structures such as lists, numpy arrays, and pandas dataframes.

last updated: 06 Mar 2020

Intro to Loops in Python

Loops can help reduce repetition in code by iteratively executing the same code on a range or list of values. Learn about the basic types of loops in Python and how they can be used to write Do Not Repeat Yourself, or DRY, code in Python.

last updated: 08 Jan 2020
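
To make the idea of replacing repeated code with a loop concrete, here is a small sketch. The precipitation values are invented sample data.

```python
# Hypothetical monthly precipitation values in inches
precip_in = [0.70, 0.75, 1.85]

# Instead of writing one conversion line per value,
# run the same conversion for every item in the list
precip_mm = []
for value in precip_in:
    precip_mm.append(value * 25.4)  # inches to millimeters
```

The loop body is written once, so adding more values to the list requires no new conversion code.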

Conditional Statements with Alternative or Combined Conditions

Conditional statements in Python can be written to check for alternative conditions or combinations of multiple conditions. Learn how to write conditional statements in Python that choose between alternative conditions or check for combinations of conditions before executing code.

last updated: 06 Mar 2020
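
One way to sketch alternative conditions (elif, else) and combined conditions (and, or) is shown below. The function name describe_day and the thresholds are hypothetical and carry no scientific meaning.

```python
def describe_day(temp_c, precip_mm):
    """Label a day using combined and alternative conditions."""
    # Both conditions must be true
    if temp_c < 0 and precip_mm > 0:
        return "snow likely"
    # At least one condition must be true
    elif temp_c > 30 or precip_mm > 50:
        return "extreme conditions"
    # Runs only when no earlier condition matched
    else:
        return "typical"
```

Conditions are checked from top to bottom, and only the first matching branch runs.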

Intro to Conditional Statements in Python

Conditional statements help you to control the flow of code by executing code only when certain conditions are met. Learn about the structure of conditional statements in Python and how they can be used to write Do Not Repeat Yourself, or DRY, code in Python.

last updated: 08 Jan 2020

Select Data From Pandas Dataframes

Pandas dataframes are a commonly used scientific data structure in Python that store tabular data using rows and columns with headers. Learn how to use indexing and filtering to select data from pandas dataframes.

last updated: 02 Nov 2019
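
The two selection styles named in this lesson, indexing and filtering, can be sketched as follows, assuming pandas is installed. The column names and values are invented for illustration.

```python
import pandas as pd

# Hypothetical table of monthly precipitation
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar"],
    "precip_in": [0.70, 0.75, 1.85],
})

# Indexing: select one value by row label and column name
feb_value = df.loc[1, "precip_in"]

# Filtering: keep only the rows where a condition is true
wet_months = df[df["precip_in"] > 0.72]
```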

Run Calculations and Summary Statistics on Pandas Dataframes

Pandas dataframes are a commonly used scientific data structure in Python that store tabular data using rows and columns with headers. Learn how to run calculations and summary statistics (such as mean or maximum) on columns in pandas dataframes.

last updated: 12 Oct 2019

Import CSV Files Into Pandas Dataframes

Pandas dataframes are a commonly used scientific data structure in Python that store tabular data using rows and columns with headers. Learn how to import text data from .csv files into pandas dataframes.

last updated: 14 Jan 2020

Intro to Pandas Dataframes

Pandas dataframes are a commonly used scientific data structure in Python that store tabular data using rows and columns with headers. Learn about the key characteristics of pandas dataframes that make them a useful data structure for storing and working with labeled scientific datasets.

last updated: 12 Oct 2019

Slice (or Select) Data From Numpy Arrays

Numpy arrays are an efficient data structure for working with scientific data in Python. Learn how to use indexing to slice (or select) data from one-dimensional and two-dimensional numpy arrays.

last updated: 02 Nov 2019
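
A short sketch of slicing one-dimensional and two-dimensional arrays, assuming numpy is installed, is shown below. The array contents are arbitrary sample values.

```python
import numpy as np

# One-dimensional array: slice by position
values = np.array([10, 20, 30, 40])
middle = values[1:3]          # elements at index 1 and 2

# Two-dimensional array: slice as [rows, columns]
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
first_row = grid[0]           # every column of row 0
second_col = grid[:, 1]       # every row of column 1
corner = grid[0:2, 0:2]       # rows 0-1, columns 0-1
```

As with Python lists, indexing starts at 0 and the end of a slice range is excluded.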

Run Calculations and Summary Statistics on Numpy Arrays

Numpy arrays are an efficient data structure for working with scientific data in Python. Learn how to run calculations and summary statistics (such as mean or maximum) on one-dimensional and two-dimensional numpy arrays.

last updated: 02 Nov 2019

Import Text Files Into Numpy Arrays

Numpy arrays are an efficient data structure for working with scientific data in Python. Learn how to import text data from .txt and .csv files into numpy arrays.

last updated: 21 Oct 2019

Intro to Numpy Arrays

Numpy arrays are a commonly used scientific data structure in Python that store data as a grid, or a matrix. Learn about the key characteristics of numpy arrays that make them an efficient data structure for storing and working with large scientific datasets.

last updated: 03 Oct 2019

Use the OS and Glob Python Packages to Manipulate File Paths

The os and glob packages are very useful tools in Python for accessing files and directories and for creating lists of paths to files and directories, respectively. Learn how to manipulate and parse file and directory paths using os and glob.

last updated: 03 Mar 2020
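
The following sketch shows one way os and glob can work together: glob builds a list of matching paths, and os.path takes those paths apart. The file names are hypothetical, and the files are created in a temporary directory so the example is self-contained.

```python
import glob
import os
import tempfile

# Create a throwaway directory holding a few empty files
data_dir = tempfile.mkdtemp()
for name in ["jan.csv", "feb.csv", "notes.txt"]:
    with open(os.path.join(data_dir, name), "w"):
        pass

# glob returns matching paths in arbitrary order, so sort them
csv_paths = sorted(glob.glob(os.path.join(data_dir, "*.csv")))

# os.path functions parse each path into its parts
csv_names = [os.path.basename(path) for path in csv_paths]
stems = [os.path.splitext(name)[0] for name in csv_names]
```

Building paths with os.path.join, rather than typing separators by hand, keeps the code portable across operating systems.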

Install Packages in Python

Packages in Python provide pre-built functionality that adds to the functionality available in base Python. Learn how to install packages in Python using conda environments.

last updated: 05 Oct 2019

Python Packages for Earth Data Science

The Python programming language provides many packages and libraries for working with scientific data. Learn about key Python packages for earth data science.

last updated: 14 Jan 2020

Customize Your Plots Using Matplotlib

Matplotlib is the most commonly used plotting library in Python. Learn how to customize the colors, symbols, and labels on your plots using matplotlib.

last updated: 03 Mar 2020

DRY Code and Modularity

DRY (Do Not Repeat Yourself) code supports reproducibility by removing repetition and making code easier to read. Learn about key strategies to write DRY code in Python.

last updated: 08 Jan 2020

Basic Operators in Python

Operators are symbols in Python that carry out a specific computation, or operation, such as arithmetic calculations. Learn how to use basic operators in Python.

last updated: 31 Oct 2019

Lists in Python

A Python list is a data structure that stores a collection of values in a specified order (or sequence) and is mutable (or changeable). Learn how to create and work with lists in Python.

last updated: 05 Oct 2019
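
The ordered and mutable properties described in this lesson can be sketched in a few lines. The list contents are arbitrary sample values.

```python
# A list keeps its items in the order they were given
months = ["January", "February", "March"]

# Lists are mutable: items can be added or replaced in place
months.append("April")
months[0] = "Jan"

# Slicing returns a new list and leaves the original unchanged
first_two = months[:2]
```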

Variables in Python

Variables store data (i.e. information) that you want to re-use in your code (e.g. single numeric value, path to a directory or file). Learn how to create and work with variables in Python.

last updated: 14 Jan 2020

Data Wrangling With Pandas

This lesson teaches you how to wrangle data (e.g. subselect, update, and combine) with pandas dataframes.

last updated: 10 Sep 2018

Tools For Open Reproducible Science

Key tools for open reproducible science include Shell (Bash), git and GitHub, Jupyter, and Python. Learn how these tools help you implement open reproducible science workflows.

last updated: 23 Jan 2020

What Is Open Reproducible Science

Open reproducible science refers to developing workflows that others can easily understand and use. It enables you to build on others' work rather than starting from scratch. Learn about the importance and benefits of open reproducible science.

last updated: 14 Jan 2020

How Do You Design and Automate a Data Workflow

Designing and developing data workflows can help you complete your work more efficiently by allowing you to repeat and automate data tasks. Learn how to design and develop efficient workflows to automate data analyses in Python.

last updated: 30 Mar 2020

Learn to Write Pseudocode for Python Programming

Pseudocode can help you design data workflows by listing the individual steps of a workflow in plain language, so the focus is on the overall data process rather than on the specific code needed. Learn best practices for writing pseudocode for data workflows.

last updated: 02 Apr 2020

Introduction to Documenting Python Software

Lack of documentation will limit people's use of your code. In this lesson you will learn about two ways to document Python code: docstrings and online documentation. You will also learn how to improve documentation in other software packages.

last updated: 01 Apr 2020
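
As an illustration of the docstring approach, the sketch below documents a small function using a numpy-style layout. The function mm_to_in is hypothetical, and the layout is one common convention, not necessarily the one the lesson uses.

```python
def mm_to_in(mm):
    """Convert a length from millimeters to inches.

    Parameters
    ----------
    mm : float
        Length in millimeters.

    Returns
    -------
    float
        Length in inches.
    """
    return mm / 25.4


# The docstring is stored on the function and shown by help()
doc_text = mm_to_in.__doc__
```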

Activity on DRY Code

This activity provides an opportunity to practice writing DRY code using loops, conditional statements, and functions.

last updated: 10 Sep 2018

Activity Data Structures

This activity provides an opportunity to practice working with commonly used Python data structures for scientific data: lists, numpy arrays, and pandas dataframes.

last updated: 10 Sep 2018

Subtract Raster Data in Python Using Numpy and Rasterio

Sometimes you need to manipulate multiple rasters to create a new raster output dataset in Python. Learn how to create a canopy height model (CHM) by subtracting an elevation raster dataset from a surface model dataset in Python.

last updated: 04 Sep 2019
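
The subtraction itself can be sketched with plain numpy arrays, assuming numpy is installed. The two small arrays below are invented stand-ins for a surface model and an elevation model. In a real workflow the values would be read from raster files (for example with rasterio), which this sketch leaves out.

```python
import numpy as np

# Hypothetical surface model: elevation of the tops of trees and buildings (m)
surface = np.array([[105.0, 110.0],
                    [102.0, 120.0]])

# Hypothetical elevation model: elevation of the bare ground (m)
ground = np.array([[100.0, 100.0],
                   [101.0, 103.0]])

# Arrays of the same shape subtract cell by cell,
# leaving the height of objects above the ground
chm = surface - ground
```

The cell-by-cell subtraction only makes sense when both rasters share the same extent, resolution, and coordinate reference system.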

Open, Plot and Explore Lidar Data in Raster Format with Python

This lesson introduces the raster GeoTIFF file format, which is often used to store lidar raster data. You will learn the three key spatial attributes of a raster dataset: coordinate reference system, spatial extent, and resolution.

last updated: 04 Sep 2019

Introduction to Multispectral Remote Sensing Data in Python

Multispectral remote sensing data can be in different resolutions and formats and often has different bands. Learn about the differences between NAIP, Landsat and MODIS remote sensing data as it is used in Python.

last updated: 17 Feb 2020

Reproject Raster Data in Python

Sometimes you will work with multiple rasters that are not in the same projections, and thus, need to reproject the rasters, so they are in the same coordinate reference system. Learn how to reproject raster data in Python using Rasterio.

last updated: 30 Mar 2020

Crop Spatial Raster Data With a Shapefile in Python

Sometimes a raster dataset covers a larger spatial extent than is needed for a particular purpose. In these cases, you can crop a raster file to a smaller extent. Learn how to crop raster data using a shapefile and export it as a new raster in open source Python.

last updated: 30 Mar 2020

Classify and Plot Raster Data in Python

Reclassifying raster data allows you to use a set of defined values to organize pixel values into new bins or categories. Learn how to classify a raster dataset and export it as a new raster in Python.

last updated: 07 Apr 2020
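
One possible way to sort pixel values into bins is numpy's digitize function, sketched below on a small invented array. The bin edges are arbitrary, and the lesson itself may use a different approach.

```python
import numpy as np

# Hypothetical raster of heights in meters
heights = np.array([[0.5, 3.0],
                    [8.0, 15.0]])

# Hypothetical class boundaries: <2, 2 to <7, 7 to <12, >=12
bin_edges = [2.0, 7.0, 12.0]

# Each pixel is replaced by the index of the bin it falls into
classes = np.digitize(heights, bin_edges)
```

With three bin edges, the result contains four classes numbered 0 through 3.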

About the Geotiff (.tif) Raster File Format: Raster Data in Python

Metadata describe the key characteristics of a dataset such as a raster. For spatial data, these characteristics include the coordinate reference system (CRS), resolution, and spatial extent. Learn about the use of TIF tags or metadata embedded within a GeoTIFF file to explore the metadata programmatically.

last updated: 22 Jan 2020

Plot Histograms of Raster Values in Python

Histograms of raster data provide the distribution of pixel values in the dataset. Learn how to explore and plot the distribution of values within a raster using histograms.

last updated: 22 Jan 2020

Open, Plot and Explore Raster Data with Python

Rasters are gridded data composed of pixels that store values, such as an image or elevation data file. Learn how to open, plot, and explore raster files in Python.

last updated: 30 Mar 2020

Work with MODIS Remote Sensing Data in Python

MODIS is a satellite remote sensing instrument that collects data daily across the globe at 250-500 m resolution. Learn how to import, clean up and plot MODIS data in Python.

last updated: 13 Mar 2020

Compare Lidar to Measured Tree Height

To explore uncertainty in remote sensing data, it is helpful to compare ground-based measurements and data that are collected via airborne instruments or satellites. Learn how to create scatter plots that compare values across two datasets.

last updated: 06 Mar 2020

Extract Raster Values at Point Locations in Python

For many scientific analyses, it is helpful to be able to select raster pixels based on their relationship to a vector dataset (e.g. locations, boundaries). Learn how to extract data from a raster dataset using a vector dataset.

last updated: 13 Mar 2020

Compare Lidar With Human Measured Tree Heights - Remote Sensing Uncertainty

Uncertainty quantifies the range of values within which the value of a measurement falls, at a specified level of confidence. Learn about the types of uncertainty that you can expect when working with tree height data derived from both lidar remote sensing and human measurements, and learn about sources of error, including systematic vs. random error.

last updated: 06 Mar 2020