Lesson 1. What Is Open Reproducible Science


Get Started with Open Reproducible Science With Bash, Git, and Jupyter Notebook - Earth analytics bootcamp course module

Welcome to the first lesson in the Get Started with Open Reproducible Science With Bash, Git, and Jupyter Notebook module. This tutorial helps you get started with open reproducible science and introduces you to tools used in open reproducible science workflows, including Bash/Shell, Git and GitHub.com, and Python in Jupyter Notebook.

In this lesson, you will learn about the importance and benefits of open reproducible science and how open workflows support it.

Learning Objectives

After completing this lesson, you will be able to:

  • Explain the importance of reproducibility in science
  • Explain how open science workflows result in open reproducible science
  • List tools that help you implement open science workflows

What You Need

You will need a web browser to play the video and review the slides in this lesson.

Watch Video About the Importance of Reproducibility in Science

First, watch a short video to learn more about the importance of reproducibility and the current reproducibility “crisis” in science.

Reproducibility in Science

Review Slides About Open Science Workflows

Next, review slides describing how open science workflows can support and promote reproducibility, resulting in open reproducible science.

View Slideshow: Share, Publish & Archive Code & Data

Useful Tools in the Open Reproducible Science Toolbox

Open reproducible science results from open science workflows that allow you to easily collaborate with others and openly publish your workflows to contribute to broader scientific knowledge.

This figure presents a useful illustration of the open science workflow, highlighting the roles of data, code, and workflows. Source: Max Joseph, Earth Lab at University of Colorado, Boulder.

To implement open science workflows, you need tools that help you document and share various aspects of your workflow, such as the details of the data collection or your code for data analysis.

While many tools support open reproducible science, in this course you will learn how to implement open science workflows using Shell, Git/GitHub.com, and Python in Jupyter Notebook.

Shell

Shell is the primary program that computers use to receive code (i.e. commands) and return information produced by executing these commands (i.e. output). These commands can be entered via a Terminal (i.e. Command Line Interface - CLI), which you will work with in this course.

Using Shell helps you to:

  • easily navigate your computer to access and manage files and folders (i.e. directories)
  • quickly and efficiently work with many files and directories at once
  • run programs that provide more functionality at the command line (e.g. the Git tool suite)
  • launch programs from specific directories on your computer (e.g. Jupyter Notebook for interactive programming)
  • use repeatable commands for these tasks across many different operating systems (Windows, Mac, Linux)

In this course, you will learn how to use Shell to access and manage files on your computer and to run other programs that can be started or run from the Terminal, such as Jupyter Notebook and Git.
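
As a small preview of the file and directory tasks listed above, the following sketch uses a few common Shell commands. The directory and file names (earth-analytics, site-a.csv, and so on) are made up for illustration and are not part of the course materials. The commands run inside a temporary scratch directory so they do not touch your real files.

```shell
# Create a scratch directory so the example does not touch your real files.
workdir="$(mktemp -d)"
cd "$workdir"

# Show the directory you are currently in.
pwd

# Make a nested project structure (hypothetical names):
# one directory for data and one for notebooks.
mkdir -p earth-analytics/data earth-analytics/notebooks

# Create several empty files at once.
touch earth-analytics/data/site-a.csv earth-analytics/data/site-b.csv

# List the contents of the data directory.
ls earth-analytics/data

# Work with many files at once: copy every CSV file into a backup directory.
mkdir earth-analytics/backup
cp earth-analytics/data/*.csv earth-analytics/backup/
```

The same commands should behave the same way in a Bash Terminal on Mac and Linux, and in a Bash environment on Windows.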

Git and GitHub

Git is a useful tool that helps you track changes in files, a process called version control. GitHub is a cloud-based service for hosting Git repositories, which allows you to store and manage your files in the cloud.

Both tools work well together to support sharing files and collaborating on workflows. With Git, you can work on your files locally and then upload changes to the cloud version of your files on GitHub.com. You can also share your files with others, who can review the files and suggest changes.

In this course, you will learn how to use Git/GitHub workflows to implement version control for your files and to collaborate with others.
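
To give a sense of what tracking changes locally looks like, here is a minimal sketch of recording two versions of a file with Git. The repository name (my-analysis), the file name (notes.txt), and the user name and email are placeholders chosen for this example, and the sketch assumes Git is already installed.

```shell
# Create a scratch repository (hypothetical name) in a temporary directory.
repo="$(mktemp -d)/my-analysis"
mkdir -p "$repo"
cd "$repo"
git init -q

# Placeholder identity, set for this repository only, so commits can be recorded.
git config user.name "Example User"
git config user.email "user@example.com"

# Create a file, stage it, and record a first version.
echo "Step 1: download data" > notes.txt
git add notes.txt
git commit -q -m "Add first analysis note"

# Change the file, then stage and record a second version.
echo "Step 2: clean data" >> notes.txt
git add notes.txt
git commit -q -m "Add data cleaning note"

# Review the history of changes, one line per commit.
git log --oneline
```

Uploading these changes to GitHub.com is a further step (using `git push` to a remote repository) that requires a GitHub account, so it is not shown here.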

Python in Jupyter Notebook

Python is a widely used programming language in the sciences and provides strong functionality for working with a variety of data types and formats. Jupyter Notebook is a web-based tool that allows you to write and run Python interactively and to organize your code together with its outputs and documentation within Jupyter Notebook files.

Writing and organizing your Python code with Jupyter Notebook supports open reproducible science by making it easier to collaborate with others and to document your work.

In this course, you will learn how to use Jupyter Notebook to write and run Python code to organize, analyze, and visualize earth and environmental science data.
