Lesson 4. How to Replace Raster Cell Values with Values from A Different Raster Data Set in Python


Learning Objectives

  • Replace (masked) values in one xarray DataArray with values in another array.

Sometimes you have many bad pixels in a landsat scene that you wish to replace or fill in with pixels from another scene. In this lesson you will learn how to replace pixels in one scene with those from another using Xarray.

To begin, open both of the pre-fire raster stacks. You got the cloud free data as a part of your homework, last week. The scene with the cloud is in the cold spring fire data that you downloaded last week.

import os
from glob import glob

import matplotlib.pyplot as plt
from matplotlib import patches as mpatches
from matplotlib.colors import ListedColormap
import seaborn as sns
import numpy as np
from numpy import ma
from shapely.geometry import box
import xarray as xr
import rioxarray as rxr
import earthpy as et
import earthpy.spatial as es
import earthpy.plot as ep
import earthpy.mask as em

# Prettier plotting with seaborn
sns.set_style('white')
sns.set(font_scale=1.5)

# Download data and set working directory
data = et.data.get_data('cold-springs-fire')
data_2 = et.data.get_data('cs-test-landsat')
os.chdir(os.path.join(et.io.HOME, 'earth-analytics', 'data'))
Downloading from https://ndownloader.figshare.com/files/10960214?private_link=fbba903d00e1848b423e
Extracted output to /root/earth-analytics/data/cs-test-landsat/.

First, import the landsat rasters and mask out the clouds like you did in the previous lesson.

# Custom function to read in list of tifs into an xarray object
def combine_tifs(tif_list):
    """A function that combines a list of tifs in the same CRS
    and of the same extent into an xarray object

    Parameters
    ----------
    tif_list : list
        A list of paths to the tif files that you wish to combine.
        
    Returns
    -------
    An xarray object with all of the tif files in the listmerged into 
    a single object.

    """

    out_xr=[]
    for i, tif_path in enumerate(tif_list):
        out_xr.append(rxr.open_rasterio(tif_path, masked=True).squeeze())
        out_xr[i]["band"]=i+1
     
    return xr.concat(out_xr, dim="band") 

# Stack the Landsat pre fire data
landsat_paths_pre_path = os.path.join("cold-springs-fire", 
                                      "landsat_collect",
                                      "LC080340322016070701T1-SC20180214145604", 
                                      "crop",
                                      "*band[2-4]*.tif")

landsat_paths_pre = glob(landsat_paths_pre_path)
landsat_paths_pre.sort()

landsat_pre_cloud = combine_tifs(landsat_paths_pre)


# Calculate bounds object
landsat_pre_cloud_ext_bds = landsat_pre_cloud.rio.bounds()

# Open the pixel_qa layer for your landsat scene
landsat_pre_cl_path = os.path.join("cold-springs-fire", 
                                   "landsat_collect",
                                   "LC080340322016070701T1-SC20180214145604", 
                                   "crop",
                                   "LC08_L1TP_034032_20160707_20170221_01_T1_pixel_qa_crop.tif")

landsat_qa = rxr.open_rasterio(landsat_pre_cl_path).squeeze()

# Generate array of all possible cloud / shadow values
cloud_shadow = [328, 392, 840, 904, 1350]
cloud = [352, 368, 416, 432, 480, 864, 880, 928, 944, 992]
high_confidence_cloud = [480, 992]

vals_to_mask = cloud_shadow + cloud + high_confidence_cloud

# Call the earthpy mask function using pixel QA layer
landsat_pre_cloud_masked = landsat_pre_cloud.where(~landsat_qa.isin(vals_to_mask))

Plot the data to ensure that the cloud covered pixels are masked.

# Masking out NA values with numpy in order to plot with ep.plot_rgb
landsat_pre_cloud_masked_plot = ma.masked_array(landsat_pre_cloud_masked.values, landsat_pre_cloud_masked.isnull())

ep.plot_rgb(landsat_pre_cloud_masked_plot,
            rgb=[2, 1, 0],
            title="Masked Landsat Image | 30 meters \n Post Cold Springs Fire \n July 8, 2016")
plt.show()
Plotting the image and the mask together to ensure the mask does indeed cover the cloud in the image.
Plotting the image and the mask together to ensure the mask does indeed cover the cloud in the image.

Read and Stack Cloud Free Data

Next, read in and stack the cloud free landsat data. Below you access the bounds object of a rioxarray object with xarray_name.rio.bounds(). This contains the spatial extent of the cloud free raster. You will use this to ensure that the bounds of both datasets are the same before replacing pixel values.

# Read in the "cloud free" landsat data that you downloaded as a part of your homework
landsat_paths_pre_cloud_free = glob(
    os.path.join("cs-test-landsat", "*band[2-4]*.tif"))

landsat_paths_pre_cloud_free.sort()

landsat_pre_cloud_free = combine_tifs(landsat_paths_pre_cloud_free)


# Calculate bounds object
landsat_no_clouds_bds = landsat_pre_cloud_free.rio.bounds()
# Are the bounds the same?
landsat_no_clouds_bds == landsat_pre_cloud_ext_bds
False
# Make polygons from the bounds
cloud_free_scene_bds = box(*landsat_no_clouds_bds)
cloudy_scene_bds = box(*landsat_pre_cloud_ext_bds)

# Do the data overlap spatially?
cloud_free_scene_bds.intersects(cloudy_scene_bds)
True
# Plot the boundaries
x, y = cloud_free_scene_bds.exterior.xy
x1, y1 = cloudy_scene_bds.exterior.xy

fig, ax = plt.subplots(1, 1, figsize=(8, 6))

ax.plot(x, y, color='#6699cc', alpha=0.7,
        linewidth=3, solid_capstyle='round', zorder=2)

ax.plot(x1, y1, color='purple', alpha=0.7,
        linewidth=3, solid_capstyle='round', zorder=2)

ax.set_title('Are the spatial extents different?')

plt.show()
Overlapping spatial extents of the masked Landsat image and the image that will be used to fill in the masked values.
Overlapping spatial extents of the masked Landsat image and the image that will be used to fill in the masked values.
# Is the CRS the same in each raster?
landsat_pre_cloud.rio.crs == landsat_pre_cloud_free.rio.crs
True
# Are the shapes the same?
landsat_pre_cloud.shape == landsat_pre_cloud_free.shape
False

You’ve now determined that

  1. the data do not have the same bounds
  2. the data are in the same Coordinate Reference System and
  3. the data do overlap (or intersect).

Since the two images do not cover the same spatial extent, the next step is to CROP the cloud-free data (which has a larger spatial extent) to the spatial extent of the cloudy data so we can then reassign all cloud covered pixels to the values in the cloud free data (in the same location).

landsat_clouds_clip = es.extent_to_json(list(landsat_pre_cloud_ext_bds))
# Clip the data to the extent of the other landsat scene

landsat_pre_cloud_free = landsat_pre_cloud_free.rio.clip([landsat_clouds_clip])
# View the shape of each scene. are they the same?
landsat_pre_cloud_free.shape, landsat_pre_cloud_masked.shape
((3, 177, 246), (3, 177, 246))

Once the data are cropped to the same extent, you can replace values using xarray’s where() function.

# Get the mask layer from the pre_cloud data
mask = landsat_pre_cloud_masked.isnull()

# Assign every cell in the new array that is masked
# to the value in the same cell location as the cloud free data
landsat_pre_cloud_masked_val_replace = xr.where(mask, landsat_pre_cloud_free, landsat_pre_cloud_masked)

Finally, plot the data. Does it look like it reassigned values correctly?

# Masking out NA values with numpy in order to plot with ep.plot_rgb
landsat_pre_cloud_masked_val_replace_plot = ma.masked_array(landsat_pre_cloud_masked_val_replace.values, landsat_pre_cloud_masked_val_replace.isnull())

ep.plot_rgb(landsat_pre_cloud_masked_val_replace_plot,
            rgb=[2, 1, 0],
            title="Masked Landsat CIR Composite Image | 30 meters \n Post Cold Springs Fire \n July 8, 2016")
plt.show()
Landsat CIR Composite image after replacement of masked pixel values using a cloud-free image for the post-Cold Springs fire.
Landsat CIR Composite image after replacement of masked pixel values using a cloud-free image for the post-Cold Springs fire.

The above answer is not perfect! You can see that the boundaries of the masked area are still visible. Also there are dark shadowed pixels that were not replaced given the raster pixel_qa layer did not assign those as pixels to be masked. Thus you may need to do a significant amount of further analysis to get this image to where you’d like it to be. But you at least have a start at getting there!

In the case of this class, a large enough portion of the study area is covered by clouds that it makes more sense to find a new scene with cloud cover. However, it is good to understand how to replace pixel values in the case that you may need to do so for smaller areas in the future.

Leave a Comment