Max Joseph

Max Joseph has contributed to the materials listed below. Max is a data scientist with the Analytics Hub at Earth Lab and maintains this website.

Course Lessons

Course lessons are developed as a part of a course curriculum. They teach specific learning objectives associated with data and scientific programming. Max Joseph has contributed to the following lessons:

Practice Forking a GitHub Repository and Submitting Pull Requests

A pull request allows anyone to suggest changes to a repository on GitHub that can be easily reviewed by others. Learn how to submit pull requests on GitHub.com to suggest changes to a GitHub repository.

An Example of a GitHub Collaborative Workflow for Team Science

GitHub.com can be used to store and access files in the cloud using GitHub repositories. Learn how to submit pull requests on GitHub.com to suggest changes to a GitHub repository.

Track, Manage and Discuss Project Changes and Updates Using GitHub Issues

An issue is a GitHub project management tool that allows anyone to identify and discuss potential changes to a repo. Learn how to create and manage GitHub issues to support collaborative open reproducible science projects.

Sync a GitHub Repo: How To Ensure Your GitHub Fork Is Up To Date

When you are working on a forked GitHub repository, you will need to update your files frequently. Learn how to update your GitHub fork using a reverse pull request.

How To Create A Pull Request on GitHub: Propose Changes to GitHub Repositories

A pull request allows anyone to suggest changes to a repository on GitHub that can be easily reviewed by others. Learn how to submit pull requests on GitHub.com to suggest changes to a GitHub repository.

Learn How To Use GitHub to Collaborate on Open Science Projects

GitHub is a website that supports git-based version control and collaborative project management. Learn how to use git and GitHub to collaborate on projects in support of open reproducible science.

How To Organize Your Project: Best Practices for Open Reproducible Science

Open reproducible science refers to developing workflows that others can easily understand and use. Learn about best practices for organizing open reproducible science projects including the use of machine readable names.

Tools For Open Reproducible Science

Key tools for open reproducible science include Shell (Bash), git and GitHub, Jupyter, and Python. Learn how these tools help you implement open reproducible science workflows.

What Is Open Reproducible Science

Open reproducible science refers to developing workflows that others can easily understand and use. It enables you to build on others' work rather than starting from scratch. Learn about the importance and benefits of open reproducible science.

How Do You Design and Automate a Data Workflow

Designing and developing data workflows can help you complete your work more efficiently by allowing you to repeat and automate data tasks. Learn how to design and develop efficient workflows to automate data analyses in Python.

Learn to Write Pseudocode for Python Programming

Pseudocode can help you design data workflows by listing out the individual steps of a workflow in plain language, so the focus is on the overall data process rather than on the specific code needed. Learn best practices for writing pseudocode for data workflows.
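
As a minimal sketch of this idea (not taken from the lesson itself), the plain-language steps can be written as comments first and then filled in with Python. The sample values and the function name `mean_monthly_precip` are hypothetical and chosen only for illustration.

```python
# Pseudocode (plain language first):
# 1. Start with a list of (month, precipitation) records.
# 2. Group the precipitation values by month.
# 3. Compute the mean value for each month.
# 4. Return the monthly means.

from collections import defaultdict


def mean_monthly_precip(records):
    """Return a dict mapping each month to its mean precipitation."""
    # Step 2: group the values by month
    grouped = defaultdict(list)
    for month, value in records:
        grouped[month].append(value)
    # Steps 3 and 4: compute and return the mean for each month
    return {month: sum(values) / len(values) for month, values in grouped.items()}


# Step 1: hypothetical sample data, made up for this sketch
records = [("jan", 1.0), ("jan", 3.0), ("feb", 2.0)]
monthly_means = mean_monthly_precip(records)
print(monthly_means)
```

Writing the numbered steps first keeps attention on the overall process; the code underneath is only one of several reasonable ways to implement them.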

Data Workflow Best Practices - Things to Consider When Processing Data

Identifying aspects of a workflow that can be modularized and tested can help you design efficient and effective data workflows. Learn best practices for designing efficient data workflows.

About the ReStructured Text Format - Introduction to .rst

reStructuredText (RST) is a text format similar to Markdown that is often used to document Python software. Learn how to create headings, lists and code blocks in a text file using RST syntax.

Introduction to Documenting Python Software

Lack of documentation will limit people's use of your code. In this lesson you will learn about two ways to document Python code: docstrings and online documentation. You will also learn how to improve documentation in other software packages.

The GitHub Workflow - How to Contribute To Open Source Software

Open source means that you can view and contribute to software code like packages you use in Python. Learn about the ways that you can contribute without being an expert programmer.

Introduction to Open Source Software - What Is It and How Can You Help?

Open source means that you can view and contribute to software code like packages you use in Python. Learn about the ways that you can contribute without being an expert programmer.

Practice Using Git and GitHub to Manage Files

Practice your skills setting up git locally, committing changes to files, and pushing and pulling files to GitHub.com.

Undo Local Changes With Git

A version control system allows you to track and manage changes to your files. Learn how to undo changes in git after they have been added or committed to version control.

Get Started with Git Commands for Version Control

A version control system allows you to track and manage changes to your files. Learn how to use some basic Git commands including add, commit and push.

How To Set Up Git Locally On Your Computer

Learn how to set up git locally on your computer.

What Is Version Control

A version control system allows you to track and manage changes to your files. Learn the benefits of version control for scientific workflows and how git and GitHub.com support version control.

Guided Activity on Git/GitHub.com For Collaboration

This lesson teaches you how to collaborate with others in a project, including tasks such as notifying others that an assigned task has been completed.

Guided Activity on Undo Changes in Git

This lesson teaches you how to undo changes in Git after they have been added or committed.

Interactive Maps in Python

Folium is a Python package that can be used to create interactive maps in Jupyter Notebook. Learn how to create interactive maps with raster overlays in Python using Folium.

Programmatically Accessing Geospatial Data Using APIs

This lesson walks through the process of retrieving and manipulating surface water data housed in the Colorado Information Warehouse. These data are stored in JSON format with spatial x, y information that supports mapping.

Introduction to Working With JSON Data in Open Source Python

This lesson introduces how to work with the JSON data structure in Python, using the json and pandas libraries to create and convert JSON objects.

Introduction to JSON Data in Python

JSON is a powerful text-based data format that contains hierarchical data. JSON and GeoJSON are common data formats that are returned when automatically accessing data using an API. Learn more about JSON and GeoJSON data.
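
To illustrate what "hierarchical" means here, the following is a minimal sketch that parses a small nested JSON document with Python's standard `json` module. The response text, station name, and values are made up for this example and do not come from any real API.

```python
import json

# Hypothetical text, standing in for what an API might return (values are made up)
response_text = """
{
  "station": {"name": "Example Creek", "location": {"x": -105.2, "y": 40.0}},
  "readings": [
    {"date": "2020-01-01", "flow": 12.5},
    {"date": "2020-01-02", "flow": 11.0}
  ]
}
"""

# json.loads turns the text into nested Python dicts and lists
data = json.loads(response_text)

# Walk down the hierarchy one key at a time
x = data["station"]["location"]["x"]

# The "readings" key holds a list of dicts, one per observation
flows = [reading["flow"] for reading in data["readings"]]

print(x, flows)
```

Each level of nesting in the text becomes a dict or list in Python, so values are reached by chaining keys and indexes.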

Introduction to APIs

APIs allow you to automate accessing and downloading data in your code to support open reproducible science. Learn how to use APIs to download data from the internet using open source Python.

Challenge Yourself

This lesson contains a series of challenges that require using tidyverse functions in R to process data.

Automate Workflows Using Loops in R

When you are programming, it can be easy to copy and paste code that works. However, this approach is not efficient. Learn how to create for-loops to process multiple files in R.

Handle Missing Data in R

Learn how to handle missing data in the R programming language.

Use tidyverse group_by and summarise to Manipulate Data in R

Learn how to write pseudocode to plan out your approach to working with data. Then use tidyverse functions including group_by and summarise to implement your plan.

Get Started with Clean Coding in R

Learn...

Learn to Use tidyverse and Clean Code to Work With Data in R

When working with data, you often spend the most time cleaning your data. Learn how to write more efficient code using the tidyverse in R.

Submit a pull request on the GitHub website

Learn how to create and submit a pull request to another repo.

How to fork a repo in GitHub

Learn how to fork a repository using the GitHub website.

Introduction to undoing things in git

Learn how to undo changes in git after they have been added or committed.

First steps with git: clone, add, commit, push

Learn basic git commands, including clone, add, commit, and push.

An introduction to version control

Learn what version control is, and how Git and GitHub are used in a typical version control workflow.

Programmatically Accessing Geospatial Data Using APIs - Working with and Mapping JSON Data from the Colorado Information Warehouse in R

This lesson walks through the process of retrieving and manipulating surface water data housed in the Colorado Information Warehouse. These data are stored in JSON format with spatial x, y information that supports mapping.

Programmatically Access Data Using an API in R - The Colorado Information Warehouse

This lesson covers accessing data via the Colorado Information Warehouse SODA API in R.

Introduction to the JSON data structure

This lesson covers the JSON data structure. JSON is a powerful text-based format that supports hierarchical data structures. It is the core structure used to create GeoJSON, a spatial version of JSON that can be used to create maps. JSON is often preferred over .csv files for hierarchical data because it can be more efficient to work with, particularly as data size becomes large.

Access Secure Data Connections Using the RCurl R Package

This lesson reviews how to use functions within the RCurl package to access data on a secure (https) server in R.

An Example of Creating Modular Code in R - Efficient Scientific Programming

This lesson provides an example of modularizing code in R.

Introduction to APIs

In this module, you learn various ways to access, download and work with data programmatically. These methods include downloading text files directly from a website onto your computer and into R, reading data stored in text format from a website into a data.frame in R, and accessing subsets of particular data using REST API calls in R.

Use lapply in R Instead of For Loops to Process .csv files - Efficient Coding in R

Learn how to take code in a for loop and convert it to be used in an apply function. Make your R code more efficient and expressive.

If Statements, Functions, and For Loops

Learn how to combine if statements, functions and for loops to process sets of text files.

Create For Loops

Learn how to write a for loop to process a set of .csv format text files in R.

Working with Function Arguments

Learn how to work with function arguments in the R programming language.

Get to Know the Function Environment & Function Arguments in R

This lesson introduces the function environment and documenting functions in R. When you run a function, intermediate variables are not stored in the global environment. This not only saves memory on your computer but also keeps your environment clean, reducing the risk of conflicting variables.

How to Write a Function in R - Automate Your Science

Learn how to write a function in the R programming language.

What Could Be Improved in This R Code?

Write Efficient Scientific Code - the DRY (Don't Repeat Yourself) Principle

This lesson will cover the basic principles of using functions and why they are important.

Use Regression Analysis to Explore Data Relationships & Bad Data

You often want to understand the relationships between two different types of data. Learn how to use regression to determine whether there is a relationship between two variables.

Data tutorials