Lesson 1. Work with Landsat Remote Sensing Data in Python


Learn How to Work With Landsat Multispectral Remote Sensing Data in Python - Intermediate earth data science textbook course module

Welcome to the first lesson in the Learn How to Work With Landsat Multispectral Remote Sensing Data in Python module. Learn how to work with Landsat multi-band raster data stored in .tif format in Python using Rasterio

Chapter Nine - Landsat Data in Open Source Python

In this chapter, you will learn how to work with Landsat multi-band raster data stored in .tif format in Python using rasterio.

Learning Objectives

After completing this chapter, you will be able to:

  • Use glob() to create a subsetted list of file names within a specified directory on your computer.
  • Create a raster stack from a list of .tif files in Python.
  • Crop rasters to a desired extent in Python.
  • Plot various band combinations using a numpy array in Python.

What You Need

You will need a computer with internet access to complete this lesson and the Cold Springs Fire data.

Download Cold Springs Fire Data (~150 MB)

Or use earthpy et.data.get_data('cold-springs-fire')

In the NAIP data chapter in this textbook, you learned how to import a multi-band image into Python using rasterio. You then plotted the data as a composite, RGB (and CIR) image using imshow() and calculated NDVI.

In that case, all bands of the data were stored in a single .tif file. However, sometimes data are downloaded in individual bands rather than a single file.

In this chapter, you will learn how to work with Landsat data in Python. Each band in a landsat scene is often stored in an individual .tif file. Thus you will need to grab the bands that you want to work with and then bring them into a numpy array.

About Landsat Data

At over 40 years, the Landsat series of satellites provides the longest temporal record of moderate resolution multispectral data of the Earth’s surface on a global basis. The Landsat record has remained remarkably unbroken, proving a unique resource to assist a broad range of specialists in managing the world’s food, water, forests, and other natural resources for a growing world population. It is a record unmatched in quality, detail, coverage, and value. Source: USGS

Landsat 40 year timeline source: USGS.
The 40 year history of landsat missions. Source: USGS - USGS Landsat Timeline

Landsat data are spectral and collected using a platform mounted on a satellite in space that orbits the earth. The spectral bands and associated spatial resolution of the first 9 bands in the Landsat 8 sensor are listed below.

Landsat 8 Bands

BandWavelength range (nanometers)Spatial Resolution (m)Spectral Width (nm)UnitsData TypeFill Value (no data)RangeValid RangeScale Factor
Band 1 - Coastal aerosol430 - 450302.0Reflectance16-bit signed integer (int16)-9999-2000 to 160000 to 100000.0001
Band 2 - Blue450 - 510306.0Reflectance16-bit signed integer (int16)-9999-2000 to 160000 to 100000.0001
Band 3 - Green530 - 590306.0Reflectance16-bit signed integer (int16)-9999-2000 to 160000 to 100000.0001
Band 4 - Red640 - 670300.03Reflectance16-bit signed integer (int16)-9999-2000 to 160000 to 100000.0001
Band 5 - Near Infrared (NIR)850 - 880303.0Reflectance16-bit signed integer (int16)-9999-2000 to 160000 to 100000.0001
Band 6 - SWIR 11570 - 1650308.0Reflectance16-bit signed integer (int16)-9999-2000 to 160000 to 100000.0001
Band 7 - SWIR 22110 - 22903018Reflectance16-bit signed integer (int16)-9999-2000 to 160000 to 100000.0001

Review the Landsat 8 Surface Reflectance Product Guide for more details.

There are additional collected bands that are not distributed within the Landsat 8 Surface Reflectance Product such as the panchromatic band, which provides a finer resolution, gray scale image of the landscape, and the cirrus cloud band, which is used in the quality assessment process:

BandWavelength range (nanometers)Spatial Resolution (m)Spectral Width (nm)
Band 8 - Panchromatic500 - 6801518
Band 9 - Cirrus1360 - 1380302.0

Understand Landsat Data

When working with landsat, it is important to understand both the metadata and the file naming convention. The metadata tell you how the data were processed, where the data are from and how they are structured.

The file names, tell you what sensor collected the data, the date the data were collected, and more.

landsat file naming convention
Landsat file names Source: USGS Landsat - Landsat Scene Naming Conventions

Landsat File Naming Convention

Landsat and many other satellite remote sensing data is named in a way that tells you a about:

  • When the data were collected and processed
  • What sensor was used to collect the data
  • What satellite was used to collect the data.

And more.

Here you will learn a few key components of the landsat 8 collection file name. The first scene that you work with below is named:

LC080340322016072301T1-SC20180214145802

First, we have LC08

  • L: Landsat Sensor
  • C: OLI / TIRS combined platform
  • 08: Landsat 8 (not 7)

  • 034032: The next 6 digits represent the path and row of the scene. This identifies the spatial coverage of the scene

Finally, you have a date. In your case as follows:

  • 20160723: representing the year, month and day that the data were collected.

The second part of the file name above tells you more about when the data were last processed. You can read more about this naming convention using the link below.

Learn more about Landsat 8 file naming conventions.

As you work with these data, it is good to double check that you are working with the sensor (Landsat 8) and the time period that you intend. Having this information in the file name makes it easier to keep track of this as you process your data.

import os
from glob import glob # File manipulation
import matplotlib.pyplot as plt
import numpy as np
import geopandas as gpd
import rasterio as rio
import earthpy as et
import earthpy.spatial as es
import earthpy.plot as ep

# Download data and set working directory
data = et.data.get_data('cold-springs-fire')
os.chdir(os.path.join(et.io.HOME, 'earth-analytics', 'data'))
Downloading from https://ndownloader.figshare.com/files/10960109
Extracted output to /root/earth-analytics/data/cold-springs-fire/.
# Get list of all pre-cropped data and sort the data

# Create the path to your data
landsat_post_fire_path = os.path.join("cold-springs-fire", 
                    "landsat_collect", 
                    "LC080340322016072301T1-SC20180214145802", 
                    "crop")

# Generate a list of just the tif files
post_fire_tifs_list = glob(os.path.join(landsat_post_fire_path, 
                                        "*band*.tif"))

# Sort the data to ensure bands are in the correct order
post_fire_tifs_list.sort()
post_fire_tifs_list
['cold-springs-fire/landsat_collect/LC080340322016072301T1-SC20180214145802/crop/LC08_L1TP_034032_20160723_20180131_01_T1_sr_band1_crop.tif',
 'cold-springs-fire/landsat_collect/LC080340322016072301T1-SC20180214145802/crop/LC08_L1TP_034032_20160723_20180131_01_T1_sr_band2_crop.tif',
 'cold-springs-fire/landsat_collect/LC080340322016072301T1-SC20180214145802/crop/LC08_L1TP_034032_20160723_20180131_01_T1_sr_band3_crop.tif',
 'cold-springs-fire/landsat_collect/LC080340322016072301T1-SC20180214145802/crop/LC08_L1TP_034032_20160723_20180131_01_T1_sr_band4_crop.tif',
 'cold-springs-fire/landsat_collect/LC080340322016072301T1-SC20180214145802/crop/LC08_L1TP_034032_20160723_20180131_01_T1_sr_band5_crop.tif',
 'cold-springs-fire/landsat_collect/LC080340322016072301T1-SC20180214145802/crop/LC08_L1TP_034032_20160723_20180131_01_T1_sr_band6_crop.tif',
 'cold-springs-fire/landsat_collect/LC080340322016072301T1-SC20180214145802/crop/LC08_L1TP_034032_20160723_20180131_01_T1_sr_band7_crop.tif']
# Create an output array of all the landsat data stacked
landsat_post_fire_path = os.path.join("cold-springs-fire",
                                      "outputs", 
                                      "landsat_post_fire.tif")

# This will create a new stacked raster with all bands
landsat_post_fire_arr, land_meta = es.stack(post_fire_tifs_list,
                                 landsat_post_fire_path)
# View output numpy array
landsat_post_fire_arr
array([[[ 446,  476,  487, ...,  162,  220,  260],
        [ 393,  457,  488, ...,  200,  235,  296],
        [ 364,  393,  388, ...,  246,  298,  347],
        ...,
        [ 249,  283,  363, ...,  272,  268,  284],
        [ 541,  474,  364, ...,  260,  269,  285],
        [ 219,  177,  250, ...,  271,  271,  286]],

       [[ 515,  547,  572, ...,  181,  233,  261],
        [ 440,  519,  571, ...,  211,  251,  322],
        [ 411,  460,  449, ...,  264,  326,  387],
        ...,
        [ 387,  326,  427, ...,  288,  278,  301],
        [ 554,  654,  433, ...,  276,  276,  293],
        [ 291,  174,  291, ...,  292,  290,  304]],

       [[ 782,  772,  843, ...,  335,  390,  411],
        [ 684,  771,  836, ...,  363,  412,  511],
        [ 656,  725,  706, ...,  425,  518,  599],
        ...,
        [ 685,  588,  718, ...,  422,  438,  470],
        [ 881,  909,  680, ...,  412,  431,  468],
        [ 464,  289,  427, ...,  408,  435,  484]],

       ...,

       [[2445, 2271, 2417, ..., 1734, 1904, 2101],
        [2662, 2465, 2532, ..., 1736, 1824, 2165],
        [2880, 2872, 2750, ..., 1897, 2116, 2300],
        ...,
        [1900, 1917, 2076, ..., 1722, 1891, 1890],
        [1779, 1893, 1983, ..., 1645, 1847, 2090],
        [1553, 1440, 1587, ..., 1562, 1689, 1964]],

       [[2864, 2974, 3108, ...,  983, 1195, 1271],
        [2527, 2827, 3008, ..., 1132, 1293, 1546],
        [2141, 2427, 2433, ..., 1324, 1652, 1922],
        ...,
        [1662, 1757, 1922, ..., 1463, 1472, 1519],
        [1786, 1532, 1554, ..., 1374, 1423, 1450],
        [1071,  943,  975, ..., 1524, 1461, 1518]],

       [[1920, 1979, 2098, ...,  537,  660,  687],
        [1505, 1863, 1975, ...,  651,  747,  924],
        [1240, 1407, 1391, ...,  769, 1018, 1189],
        ...,
        [1216, 1190, 1398, ...,  877,  890,  928],
        [1517, 1184, 1078, ...,  846,  810,  820],
        [ 660,  593,  623, ...,  984,  909,  880]]], dtype=int16)
# Plot all bands
ep.plot_bands(landsat_post_fire_arr)
plt.show()

Plot RGB image

Just like you did with NAIP data, you can plot 3 band color composite images for Landsat using the earthpy ep.plot_rgb() function. Refer to the landsat bands in the table at the top of this page to figure out the red, green and blue bands. Or read the ESRI landsat 8 band combinations post.

ep.plot_rgb() requires:

  1. the numpy array containing the bands that you wish to plot. IMPORTANT: this array should be in rasterio band order (bands first).

  2. The numeric location of bands that you wish to plot in the array.

ep.plot_rgb(landsat_post_fire_arr,
            rgb=[3, 2, 1],
            title="RGB Composite Image\n Post Fire Landsat Data")
plt.show()
Landsat 8 3 band color RGB composite.
Landsat 8 3 band color RGB composite.

Notice that the image above looks dark. You can stretch the image as you did with the NAIP data, too. Below you use the stretch argument built into the earthpy plot_rgb() function. The str_clip argument allows you to specify how much of the tails of the data that you want to clip off. The larger the number, the most the data will be stretched or brightened.

When the range of pixel brightness values is closer to 0, a darker image is rendered by default. You can stretch the values to extend to the full 0-255 range of potential values to increase the visual contrast of the image.
When the range of pixel brightness values is closer to 0, a darker image is rendered by default. You can stretch the values to extend to the full 0-255 range of potential values to increase the visual contrast of the image.
When the range of pixel brightness values is closer to 255, a lighter image is rendered by default. You can stretch the values to extend to the full 0-255 range of potential values to increase the visual contrast of the image.
When the range of pixel brightness values is closer to 255, a lighter image is rendered by default. You can stretch the values to extend to the full 0-255 range of potential values to increase the visual contrast of the image.
ep.plot_rgb(landsat_post_fire_arr,
            rgb=[3, 2, 1],
            title="Landsat RGB Image\n Linear Stretch Applied",
            stretch=True,
            str_clip=1)
plt.show()
Landsat 3 band RGB color composite with stretch applied.
Landsat 3 band RGB color composite with stretch applied.
# Adjust the amount of linear stretch to futher brighten the image
ep.plot_rgb(landsat_post_fire_arr,
            rgb=[3, 2, 1],
            title="Landsat RGB Image\n Linear Stretch Applied",
            stretch=True,
            str_clip=4)
plt.show()
Landsat 3 band RGB color composite with stretch and more clip applied.
Landsat 3 band RGB color composite with stretch and more clip applied.

Raster Pixel Histograms

You can create a histogram to view the distribution of pixel values in the rgb bands plotted above.

# Plot all band histograms using earthpy
band_titles = ["Band 1", "Blue", "Green", "Red",
               "NIR", "Band 6", "Band7"]

ep.hist(landsat_post_fire_arr,
        title=band_titles)

plt.show()
Landsat 8 histogram for each band.
Landsat 8 histogram for each band.

Plot CIR

Now you’ve created a red, green blue color composite image. Remember red green and blue are colors that your eye can see.

Next, create a color infrared image (CIR) using landsat bands: 4,3,2.

ep.plot_rgb(landsat_post_fire_arr, 
            rgb=[4, 3, 2],
            title="CIR Landsat Image Pre-Cold Springs Fire",
            figsize=(10, 10))
plt.show()
Landsat 8 CIR color composite image.
Landsat 8 CIR color composite image.

Data Tip: Landsat 8 Pre Collections Data

If you are working with Landsat data downloaded pre USGS collections, your data may be formatted and named slightly differently than the example shown on this page. Below is an explanation of the legacy Landsat 8 naming convention.

File: LC08_L1TP_034032_20160707_20170221_01_T1_sr_band1_crop.tif

SensorSensorSatelliteWRS pathWRS row    
LC8034032201607072017022101
LandsatOLI & TIRSLandsat 8path = 034row = 032year = 2016Month = 7, day = 7Processing Date: 2017-02-21 (feb 21 2017)Archive (second version): 01
  • L: Landsat
  • X: Sensor
    • C = OLI & TIRS O = OLI only T = IRS only E = ETM+ T = TM M = MSS
  • S Satelite
  • PPP
  • RRR
  • YYYY = Year
  • DDD = Julian DAY of the year
  • GSI - Ground station ID
  • VV = Archive Version

More here breaking down the file name.

Julian day

We won’t spend a lot of time on Julian days. The julian day used to be used in Landsat pre collections file naming. However recently they have switched to a normal year-month-date format

See this link that provide tables to help you convert julian days to actual date.

</div>

Leave a Comment