Find and Access Earth Science Data Online with R and Python Lessons

Data Management Issues in Science

There is an explosion of free data available online. However, finding and managing these data can be tricky. The lessons below will help you find and download remote sensing, social media, and other data that can be used to better understand earth systems. The lessons walk you through using website interfaces to get data and using APIs to access data directly with a scientific programming tool like R or Python.

Analyze Word Frequency Counts Using Twitter Data and Tweepy in Python

One common way to analyze Twitter data is to calculate word frequencies to understand how often words are used in tweets on a particular topic. To complete any analysis, you need to first prepare the data. Learn how to clean Twitter data and calculate word frequencies using Python.

last updated: 11 Sep 2020
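
As a rough illustration of the word frequency lesson above, the sketch below cleans a few tweets and counts words using only the Python standard library. The sample tweets, the stop word list, and the helper names `clean_tweet` and `word_frequencies` are assumptions made for this example; the lesson itself collects tweets with Tweepy and may clean them differently.

```python
import re
from collections import Counter

# Hypothetical sample tweets; a real analysis would use tweets collected with Tweepy.
tweets = [
    "Flooding in Boulder today! https://example.com/flood #flood",
    "Stay safe Boulder, the flood waters are rising",
    "Flood warning extended for Boulder county",
]

# A tiny, assumed stop word list; real analyses typically use a fuller list.
STOP_WORDS = {"in", "the", "are", "for", "a", "of", "to"}


def clean_tweet(text):
    """Lowercase a tweet, drop URLs, and return its alphabetic words."""
    no_urls = re.sub(r"https?://\S+", "", text.lower())
    return re.findall(r"[a-z]+", no_urls)


def word_frequencies(texts, stop_words=STOP_WORDS):
    """Count how often each non-stop word appears across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(w for w in clean_tweet(text) if w not in stop_words)
    return counts


freqs = word_frequencies(tweets)
print(freqs.most_common(3))
```

Removing URLs before splitting into words keeps link fragments out of the counts, which is one common cleaning step for tweet text.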

Programmatically Accessing Geospatial Data Using APIs

This lesson walks through the process of retrieving and manipulating surface water data housed in the Colorado Information Warehouse. These data are stored in JSON format with spatial x, y information that supports mapping.

last updated: 11 Sep 2020

Introduction to JSON Data in Python

JSON is a powerful text-based data format that contains hierarchical data. JSON and GeoJSON are common formats returned when you access data automatically through an API. Learn more about JSON and GeoJSON data.

last updated: 11 Sep 2020
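
To make the hierarchical structure described in the JSON lesson above concrete, here is a minimal sketch that parses a GeoJSON-style feature with Python's built-in `json` module. The record, including the station name and flow readings, is invented for illustration and does not come from any real dataset.

```python
import json

# Hypothetical GeoJSON-style feature; all values are made up for this example.
raw = """
{
  "type": "Feature",
  "geometry": {"type": "Point", "coordinates": [-105.27, 40.01]},
  "properties": {
    "station": "example-station",
    "readings": [
      {"date": "2020-01-01", "flow": 12.5},
      {"date": "2020-01-02", "flow": 14.0}
    ]
  }
}
"""

# json.loads turns the text into nested Python dicts and lists.
feature = json.loads(raw)

# GeoJSON stores point coordinates as [longitude, latitude].
lon, lat = feature["geometry"]["coordinates"]

# Walk down the hierarchy to pull out the nested readings.
flows = [reading["flow"] for reading in feature["properties"]["readings"]]
mean_flow = sum(flows) / len(flows)

print(feature["properties"]["station"], lon, lat, mean_flow)
```

Nested keys are reached by chaining lookups, which is the main practical difference from reading a flat table such as a .csv file.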

Introduction to APIs

APIs allow you to automate accessing and downloading data in your code to support open reproducible science. Learn how to use APIs to download data from the internet using open source Python.

last updated: 01 Apr 2021
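
The idea behind the API lesson above can be sketched as building a request URL in code and then reading the JSON response. The example below assumes a SODA-style endpoint that accepts `$limit` and `$where` query parameters; the domain `data.example.gov`, the dataset id `abcd-1234`, and the helper names are placeholders, not a real service. The `fetch_json` function is defined but not called here, since it needs a live endpoint.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, limit=5, where=None):
    """Build a request URL with SODA-style $limit and $where parameters."""
    params = {"$limit": limit}
    if where:
        params["$where"] = where
    # safe="$" keeps the leading "$" readable instead of encoding it as %24.
    return f"{base_url}?{urlencode(params, safe='$')}"


def fetch_json(url):
    """Download a URL and parse the response body as JSON (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


# Placeholder endpoint; substitute a real API URL before calling fetch_json.
BASE_URL = "https://data.example.gov/resource/abcd-1234.json"
url = build_query_url(BASE_URL, limit=10, where="amount > 5")
print(url)
```

Keeping the URL construction separate from the download makes the query easy to inspect and rerun, which supports the reproducibility goal the lesson describes.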

Reproject Raster Data Python

Sometimes you will work with multiple rasters that are not in the same projection and thus need to be reprojected so that they share the same coordinate reference system. Learn how to reproject raster data in Python using Rasterio.

last updated: 09 Nov 2020

About the Geotiff (.tif) Raster File Format: Raster Data in Python

Metadata describe the key characteristics of a dataset such as a raster. For spatial data, these characteristics include the coordinate reference system (CRS), resolution, and spatial extent. Learn how to use TIF tags, the metadata embedded within a GeoTIFF file, to explore the metadata programmatically.

last updated: 19 Aug 2021

Creating Interactive Spatial Maps in R Using Leaflet

This lesson covers the basics of creating an interactive map using the leaflet API in R. We will import data from the Colorado Information Warehouse using the SODA RESTful API and then create an interactive map that can be published to an HTML-formatted file using knitr and rmarkdown.

last updated: 03 Sep 2019

Introduction to the JSON data structure

This lesson covers the JSON data structure. JSON is a powerful text-based format that supports hierarchical data structures. It is the core structure used to create GeoJSON, a spatial version of JSON that can be used to create maps. JSON is often preferred over .csv files for hierarchical data structures, as it can be more efficient, particularly as data size becomes large.

last updated: 03 Sep 2019

Introduction to APIs

In this module, you will learn various ways to access, download, and work with data programmatically. These methods include downloading text files directly from a website onto your computer and into R, reading data stored in text format on a website into a data.frame in R, and accessing subsets of particular data using REST API calls in R.

last updated: 30 Mar 2020

How to Open and Use Files in Geotiff Format

A GeoTIFF is a standard file format with spatial metadata embedded as tags. Use the raster package in R to open GeoTIFF files and access their spatial metadata programmatically.

last updated: 03 Sep 2019

How to Address Missing Values in R

Missing data in R can be caused by issues in data collection and/or processing and present challenges in data analysis. Learn how to address missing data values in R.

last updated: 03 Sep 2019