Find and Access Earth Science Data Online with R and Python Lessons

Data Management Issues in Science

There is an explosion of free data available online. However, finding and managing these data can be tricky. The lessons below will help you find and download remote sensing, social media, and other data that can be used to better understand earth systems. The lessons walk you through using website interfaces to get data and using APIs to access data directly from a scientific programming tool like R or Python.

Create Word Frequency Counts and Sentiments Using Twitter Data and Tweepy in Python

One common way to analyze Twitter data is to calculate word frequencies to understand how often words are used in tweets on a particular topic. Another common task is to analyze sentiment in Twitter data. Both require some data cleanup. Learn how to clean Twitter data, calculate word frequencies, and analyze sentiments in Python.

last updated: 28 Nov 2018
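
As a rough illustration of the cleanup and word-count steps, here is a minimal sketch that uses only the Python standard library. The sample tweets and the tiny stop-word list are hypothetical; in the lesson, the tweets would come from the Twitter API via Tweepy.

```python
import re
from collections import Counter

# Hypothetical sample tweets (assumption); real tweets would come from Tweepy.
tweets = [
    "Flooding in Boulder today! https://example.com/a",
    "Boulder flooding update: roads closed https://example.com/b",
    "Stay safe, Boulder. Roads are closed.",
]

# A tiny illustrative stop-word list (assumption, not a standard list).
STOP_WORDS = {"in", "are", "the", "a", "today"}


def clean_tweet(text):
    """Remove URLs, lowercase the text, and return the words that remain."""
    no_urls = re.sub(r"https?://\S+", "", text)
    # Keep runs of letters and apostrophes; this drops punctuation and digits.
    words = re.findall(r"[a-z']+", no_urls.lower())
    return [word for word in words if word not in STOP_WORDS]


# Count how often each cleaned word appears across all tweets.
word_counts = Counter(word for tweet in tweets for word in clean_tweet(tweet))
top_words = word_counts.most_common(3)
print(top_words)
```

Removing URLs before counting matters here, because otherwise fragments such as "https" would show up among the most frequent words.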

Programmatically Accessing Geospatial Data Using APIs

This lesson walks through the process of retrieving and manipulating surface water data housed in the Colorado Information Warehouse. These data are stored in JSON format with spatial x, y information that supports mapping.

last updated: 28 Nov 2018
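
To show how JSON records with x, y information can support mapping, here is a minimal sketch that converts such records into a GeoJSON FeatureCollection. The records and field names ("station_name", "longitude", "latitude", "amount") are assumptions made for illustration, not the actual schema of the Colorado Information Warehouse data.

```python
import json

# Hypothetical records in the style of a JSON API response (assumption).
raw = """[
  {"station_name": "Station A", "longitude": "-105.25", "latitude": "40.01", "amount": "12.5"},
  {"station_name": "Station B", "longitude": "-105.10", "latitude": "40.05", "amount": "3.0"}
]"""

records = json.loads(raw)


def to_feature(record):
    """Convert one record into a GeoJSON Point feature (x = longitude, y = latitude)."""
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            # GeoJSON stores coordinates in [x, y] order, i.e. [longitude, latitude].
            "coordinates": [float(record["longitude"]), float(record["latitude"])],
        },
        "properties": {
            "station_name": record["station_name"],
            "amount": float(record["amount"]),
        },
    }


feature_collection = {
    "type": "FeatureCollection",
    "features": [to_feature(record) for record in records],
}
print(json.dumps(feature_collection, indent=2))
```

The values arrive as strings in this sketch, so they are converted to floats before being used as coordinates.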

Introduction to the JSON data structure

This lesson covers the JSON data structure. JSON is a powerful text-based format that supports hierarchical data structures. It is the core structure used to create GeoJSON, which is a spatial version of JSON that can be used to create maps. JSON is often preferred over CSV files for hierarchical data structures, as it can be more efficient, particularly as data size becomes large.

last updated: 28 Nov 2018
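
A short sketch of what "hierarchical" means in practice: the hypothetical site record below nests a location object and a list of measurements inside one structure, which a flat CSV row cannot express directly. All names and values are made up for illustration.

```python
import json

# A hypothetical site record with nested (hierarchical) structure (assumption).
site = {
    "site_name": "Example Creek",
    "location": {"latitude": 40.0, "longitude": -105.3},
    "measurements": [
        {"date": "2018-01-01", "discharge": 1.2},
        {"date": "2018-01-02", "discharge": 1.5},
    ],
}

# Serialize the Python dictionary to JSON text, then parse it back.
json_text = json.dumps(site, indent=2)
parsed = json.loads(json_text)

# Nested values are reached by chaining keys and list positions.
latitude = parsed["location"]["latitude"]
discharges = [m["discharge"] for m in parsed["measurements"]]
print(latitude, discharges)
```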

Introduction to APIs

In this module, you learn various ways to access, download and work with data programmatically. These methods include downloading text files directly from a website onto your computer and into Python, reading in data stored in text format from a website into a DataFrame in Python, and finally, accessing subsets of particular data using REST API calls in Python.

last updated: 28 Nov 2018
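
Requesting a subset of data from a REST API usually comes down to adding query parameters to a URL. The sketch below only builds such a URL and makes no network request; the endpoint and the parameter names ("county", "limit") are hypothetical and would differ for a real API.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint (assumption); replace with a real API's base URL.
BASE_URL = "https://data.example.org/resource/abcd-1234.json"


def build_query_url(base_url, params):
    """Append URL-encoded query parameters to a base URL."""
    return f"{base_url}?{urlencode(params)}"


# Ask for at most five records from one county (hypothetical parameters).
url = build_query_url(BASE_URL, {"county": "Boulder", "limit": 5})
print(url)

# Parsing the URL back shows exactly which parameters would be sent.
sent_params = parse_qs(urlsplit(url).query)
print(sent_params)
```

Building the URL with `urlencode` rather than by hand keeps special characters such as spaces correctly escaped.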

About the Geotiff (.tif) Raster File Format: Raster Data in Python

This lesson introduces the GeoTIFF file format. It also introduces the concept of metadata, or data about the data. Metadata describe key characteristics of a data set. For spatial data, these characteristics include CRS, resolution, and spatial extent. Here you learn about tif tags, the metadata embedded within a GeoTIFF file, and how they can be used to explore data programmatically.

last updated: 25 Sep 2018

Creating Interactive Spatial Maps in R Using Leaflet

This lesson covers the basics of creating an interactive map using the leaflet API in R. We will import data from the Colorado Information Warehouse using the SODA RESTful API and then create an interactive map that can be published to an HTML-formatted file using knitr and rmarkdown.

last updated: 10 Jan 2018

Introduction to the JSON data structure

This lesson covers the JSON data structure. JSON is a powerful text-based format that supports hierarchical data structures. It is the core structure used to create GeoJSON, which is a spatial version of JSON that can be used to create maps. JSON is often preferred over .csv files for hierarchical data structures, as it can be more efficient, particularly as data size becomes large.

last updated: 10 Jan 2018

Introduction to APIs

In this module, you learn various ways to access, download and work with data programmatically. These methods include downloading text files directly from a website onto your computer and into R, reading in data stored in text format from a website into a data.frame in R, and finally, accessing subsets of particular data using REST API calls in R.

last updated: 10 Jan 2018

How to Open and Use Files in Geotiff Format

A GeoTIFF is a standard file format with spatial metadata embedded as tags. Use the raster package in R to open GeoTIFF files and access their spatial metadata programmatically.

last updated: 10 Jan 2018

How to Address Missing Values in R

Missing data in R can be caused by issues in data collection and/or processing, and they present challenges in data analysis. Learn how to address missing data values in R.

last updated: 10 Jan 2018