GEOG 4463 & 5463 - Earth Analytics Bootcamp: Homework 4


Homework 4

For this assignment, you will create a Jupyter Notebook with your answers to the questions below, and submit this Jupyter Notebook to a Github repository for Homework 4 following the instructions below Part II: Submit Your Jupyter Notebook to GitHub.

You need to complete this assignment (Homework 4) by Tuesday, August 21st at 8:00 AM (U.S. Mountain Daylight Time). See this link to convert the due date/time to your local time.

This assignment will primarily test your skills to notify others of pull requests on Github.com from Day 9 and to write functions from Day 10. Your skills will loops from Day 7 and conditional statements from Day 8 will also be reviewed.

You will be asked to work with familiar data: temperature and precipitation for various months and years of data for Boulder, Colorado, provided by the U.S. National Oceanic and Atmospheric Administration (NOAA).

What You Need

Be sure that you have completed all of the lessons from Days 7 to 10 for the Earth Analytics Bootcamp. Completing the challenges at the end of the lessons will also help you with this assignment. Review the lessons as needed to answer the questions.

You will need to fork and clone a Github repository for Homework 4 from https://github.com/earthlab-education/ea-bootcamp-hw-4-yourusername. You will receive an invitation to the Github repository for Homework 4 via CANVAS.

Note: the repository will be empty, as you will add a new Jupyter Notebook containing your answers to the questions below.

Part I: Create and Modify a Jupyter Notebook

Begin by creating a new Jupyter Notebook in your forked repository from https://github.com/yourusername/ea-bootcamp-hw-4-yourusername.

Rename the file to firstinitial-lastname-ea-bootcamp-hw-4.ipynb (e.g. jpalomino-ea-bootcamp-hw-4.ipynb).

Note that Git will recognize this new Jupyter Notebook as a new file that can be added, committed, and pushed back to your forked repository on Github.com.

Be Sure to Add Documentation to Your Notebook (4 pts)

Start with Markdown cell containing a Markdown title for this assignment, plus an author name and date in list form. Bold the words for author and date, but do not bold your name and today’s date.

Add a Markdown cell before each code cell you create to describe the purpose of your code (e.g. what are you accomplishing by executing this code?). Think carefully about how many cells you should have to best organize your data (hint: review lessons for examples of how code can be grouped into cells).

Within code cells, be sure to also add Python comments to document each code block and use the PEP 8 guidelines to assign appropriate variable names that are short and concise but also clearly indicate the kind of data contained in the variable.

Be Sure to Document Functions and Assign Clear Function Names (8 pts)

In this assignment, you will be asked to write several functions.

Be sure to add documentation within your functions using Python comments to tell the user what the function is doing and and what inputs it can take.

Also, be sure to use clear function names that tell the user what the function does. If you find it useful, you can review the Earth Analytics Bootcamp reference page on PEP8 Style Guide.

Question 1: Import Python Packages (2 pts)

In the questions below, you will be working with numpy arrays.

You will also be downloading files using urllib.request and accessing directories and files on your computer using os. Last, you will also be creating plots of your data.

Import all of the necessary Python packages to accomplish these tasks.

Question 2: Download CSV Files and Import Into Numpy Arrays (2 pts)

Use .urllib.request to download the following .csv files of monthly precipitation (millimeters) and temperature (Celsius) - notice the units this time:

  1. monthly-precip-mm-1998-to-2017.csv from https://ndownloader.figshare.com/files/12799931

  2. monthly-temp-cel-1998-to-2017.csv from https://ndownloader.figshare.com/files/12799928

Each dataset contains a row for each year specified in the dataset name and contains a column for each month (starting with January through December).

Use the appropriate function to import the files into new numpy arrays.

Print your numpy arrays using the appropriate code to suppress the scientific notation.

Note: you are not required to write a loop to accomplish this task. You can follow the same process that you have before to download and import files.

[[ 27.178   5.842  86.614 115.824  46.228  46.99  102.108  24.638  16.764
   28.448  38.862  26.67 ]
 [ 16.51    2.032  27.686 191.77   46.736  20.828  64.516 140.716  66.548
   33.782  20.574  25.654]
 [  7.366  13.97   65.024  38.1    40.64   38.862  53.086  18.288  63.754
   32.512  22.606  11.176]
 [ 18.542  21.844  51.054  76.708  91.948  27.686  44.704  41.656  44.958
   10.16   25.908   9.144]
 [ 27.178  11.176  38.1     5.08   81.28   29.972   2.286  36.576  38.608
   61.976  19.812   0.508]
 [  2.286  38.608 138.176  75.946  66.548  68.326  18.034  89.408   8.89
   11.43   20.32   21.336]
 [ 20.828  33.274  27.686 143.764  32.512 100.584  87.376  73.152  52.578
   58.928  50.546   8.89 ]
 [ 35.56    7.874  30.988  98.044  48.514  68.072  10.668  41.402  13.208
   71.12    8.636  10.922]
 [ 11.176  17.272  52.832  26.416  28.956  33.528  66.802  31.242  31.75
   94.234  18.796  77.47 ]
 [ 42.672  21.844  42.926  56.896  45.466   9.652  20.32   46.228  48.768
   35.052  11.938  53.34 ]
 [ 11.684  16.002  37.338  28.702 106.934  40.132   2.286  75.438  46.736
   29.972   3.302  33.782]
 [ 15.748   6.858  48.006 149.352  78.232  68.58   36.068   8.382  10.668
   82.804  23.622  35.306]
 [  7.112  34.798  83.82   92.202  68.834  85.344  58.674  27.178   6.35
   24.13   15.494  12.192]
 [ 24.384  25.908   8.382  61.214 131.064  34.29   72.898  27.432  65.024
   41.91   24.892  48.768]
 [  9.652  49.276   0.254  33.274  45.212   9.652 126.746   9.144  57.658
   36.576   7.112  12.954]
 [  6.858  28.702  43.688 105.156  67.564  15.494  26.162  35.56  461.264
   56.896   7.366  12.7  ]
 [ 42.418  17.272  41.148  47.498 112.522  21.336 116.078  40.64   73.152
   29.464  22.352  34.798]
 [  9.652  93.726   9.652 114.3   198.628  44.704  75.692   7.874   3.556
   51.308  46.482  28.194]
 [  9.398  36.576  97.536  84.836  51.054  60.198  15.494  26.924  11.43
    9.652  11.938  23.114]
 [ 35.814  18.542  36.83   80.01  159.766  11.43   33.02   41.148  48.768
   61.468  14.478  17.272]]

[[ 2.5    2.444  3.722  8.056 14.889 16.722 22.667 21.5   19.5   10.222
   6.667  0.111]
 [ 2.444  5.611  7.778  6.944 13.111 18.222 23.056 20.722 14.722 11.056
   8.889  2.722]
 [ 2.444  5.     6.056 10.667 16.111 19.667 23.722 22.778 17.278  9.778
  -0.333 -0.444]
 [ 0.556  0.167  4.889 10.389 14.667 20.5   23.889 22.111 18.333 12.111
   6.611  1.667]
 [ 0.611  2.222  3.056 11.611 13.444 21.389 24.889 21.833 17.778  7.722
   4.611  2.556]
 [ 4.556 -0.222  6.5   10.333 14.111 17.111 24.389 22.722 15.833 14.111
   3.889  2.444]
 [ 1.889  0.889  9.     9.556 15.5   17.056 20.667 19.111 17.167 11.056
   4.278  2.5  ]
 [ 1.944  3.278  5.556  9.111 14.222 18.556 23.944 20.944 19.    11.444
   7.222  0.722]
 [ 4.833  0.944  4.111 12.167 16.111 22.    23.556 22.    14.667 10.556
   6.333  1.833]
 [-2.667  1.444  8.667  8.833 14.444 19.833 23.778 23.167 18.056 12.889
   7.167 -1.   ]
 [-0.222  2.278  4.889  8.778 13.944 18.944 23.889 20.889 16.056  7.778
   7.778 -0.5  ]
 [ 3.444  4.278  6.833  8.5   15.167 17.222 20.889 20.889 17.278  6.944
   6.556 -2.944]
 [ 0.556 -1.056  5.944  9.333 12.167 19.389 22.5   22.444 19.222 12.778
   4.333  2.889]
 [ 0.611  0.     7.333  9.389 12.056 19.778 23.056 23.944 17.444 11.611
   5.722  0.111]
 [ 3.833  0.167 10.444 12.389 15.667 23.444 23.778 22.889 18.889 10.444
   7.833  0.944]
 [ 0.556  0.056  4.722  6.556 14.278 21.056 22.333 22.333 18.389  8.778
   6.222 -0.278]
 [ 1.444  0.     6.444  9.889 13.667 19.    22.333 20.667 17.667 12.944
   3.5    1.056]
 [ 2.5    2.556  7.833 10.056 11.333 20.167 21.278 21.389 20.778 13.444
   4.889  0.611]
 [ 1.167  4.944  6.111  9.444 12.278 21.389 23.333 21.333 18.278 14.889
   8.611  0.   ]
 [ 0.111  5.722 10.167  9.389 13.167 20.389 23.278 20.556 17.5   10.833
   8.611  2.444]]

Question 3: Write Loop to Check Dimensions of Numpy Arrays (4 pts)

Write a loop to print the following information for each numpy array from the previous question:

  1. number of rows and columns
  2. number of dimensions (as a single value)

Hints:

  • You have a fixed number of items that you want upon which you want iterate the same code (i.e. the two numpy arrays you previously created).
  • There are two attributes of numpy arrays that you can use to get this information.
shape: (20, 12)
number of dimensions: 2

shape: (20, 12)
number of dimensions: 2

Question 4: Write Function to Convert Units (6 pts)

Imagine you work for NOAA, which you know publishes their precipitation data in inches, rather than millimeters. You have received new data in millimeters, so you need to convert these values to match your other NOAA datasets.

Write a function to convert the units of a numpy array from millimeters to inches (one inch = 25.4 millimeters). (Note you will execute it in the next question.)

Be sure to add documentation within your functions using Python comments to tell the user what the function is doing and and what inputs it can take.

Question 5: Apply Function and Save Output to New Numpy Array (4 pts)

Run your function on the numpy array created from monthly_precip_mm_1998-to-2017.csv to convert the units from millimeters to inches.

Save the output as a new numpy array and print the numpy array.

array([[ 1.07,  0.23,  3.41,  4.56,  1.82,  1.85,  4.02,  0.97,  0.66,
         1.12,  1.53,  1.05],
       [ 0.65,  0.08,  1.09,  7.55,  1.84,  0.82,  2.54,  5.54,  2.62,
         1.33,  0.81,  1.01],
       [ 0.29,  0.55,  2.56,  1.5 ,  1.6 ,  1.53,  2.09,  0.72,  2.51,
         1.28,  0.89,  0.44],
       [ 0.73,  0.86,  2.01,  3.02,  3.62,  1.09,  1.76,  1.64,  1.77,
         0.4 ,  1.02,  0.36],
       [ 1.07,  0.44,  1.5 ,  0.2 ,  3.2 ,  1.18,  0.09,  1.44,  1.52,
         2.44,  0.78,  0.02],
       [ 0.09,  1.52,  5.44,  2.99,  2.62,  2.69,  0.71,  3.52,  0.35,
         0.45,  0.8 ,  0.84],
       [ 0.82,  1.31,  1.09,  5.66,  1.28,  3.96,  3.44,  2.88,  2.07,
         2.32,  1.99,  0.35],
       [ 1.4 ,  0.31,  1.22,  3.86,  1.91,  2.68,  0.42,  1.63,  0.52,
         2.8 ,  0.34,  0.43],
       [ 0.44,  0.68,  2.08,  1.04,  1.14,  1.32,  2.63,  1.23,  1.25,
         3.71,  0.74,  3.05],
       [ 1.68,  0.86,  1.69,  2.24,  1.79,  0.38,  0.8 ,  1.82,  1.92,
         1.38,  0.47,  2.1 ],
       [ 0.46,  0.63,  1.47,  1.13,  4.21,  1.58,  0.09,  2.97,  1.84,
         1.18,  0.13,  1.33],
       [ 0.62,  0.27,  1.89,  5.88,  3.08,  2.7 ,  1.42,  0.33,  0.42,
         3.26,  0.93,  1.39],
       [ 0.28,  1.37,  3.3 ,  3.63,  2.71,  3.36,  2.31,  1.07,  0.25,
         0.95,  0.61,  0.48],
       [ 0.96,  1.02,  0.33,  2.41,  5.16,  1.35,  2.87,  1.08,  2.56,
         1.65,  0.98,  1.92],
       [ 0.38,  1.94,  0.01,  1.31,  1.78,  0.38,  4.99,  0.36,  2.27,
         1.44,  0.28,  0.51],
       [ 0.27,  1.13,  1.72,  4.14,  2.66,  0.61,  1.03,  1.4 , 18.16,
         2.24,  0.29,  0.5 ],
       [ 1.67,  0.68,  1.62,  1.87,  4.43,  0.84,  4.57,  1.6 ,  2.88,
         1.16,  0.88,  1.37],
       [ 0.38,  3.69,  0.38,  4.5 ,  7.82,  1.76,  2.98,  0.31,  0.14,
         2.02,  1.83,  1.11],
       [ 0.37,  1.44,  3.84,  3.34,  2.01,  2.37,  0.61,  1.06,  0.45,
         0.38,  0.47,  0.91],
       [ 1.41,  0.73,  1.45,  3.15,  6.29,  0.45,  1.3 ,  1.62,  1.92,
         2.42,  0.57,  0.68]])

Question 6: Write Function to Convert Units (8 pts)

Imagine you work for NOAA, which you know publishes their temperature data in Fahrenheit, rather than Celsius. You have received new data in Celsius, so you need to convert these values to match your other NOAA datasets.

Write a function to convert the units of a numpy array from Celsius to Fahrenheit (Fahrenheit = (Celsius * 1.8) + 32). (Note you will execute it in the next question.)

Be sure to add documentation within your functions using Python comments to tell the user what the function is doing and and what inputs it can take.

Question 7: Apply Function and Save Output to New Numpy Array (4 pts)

Run your function on the numpy array created from monthly_temp_cel_1998-to-2017.csv to convert the units from Celsius to Fahrenheit.

Save the output as a new numpy array and print the numpy array.

array([[36.5   , 36.3992, 38.6996, 46.5008, 58.8002, 62.0996, 72.8006,
        70.7   , 67.1   , 50.3996, 44.0006, 32.1998],
       [36.3992, 42.0998, 46.0004, 44.4992, 55.5998, 64.7996, 73.5008,
        69.2996, 58.4996, 51.9008, 48.0002, 36.8996],
       [36.3992, 41.    , 42.9008, 51.2006, 60.9998, 67.4006, 74.6996,
        73.0004, 63.1004, 49.6004, 31.4006, 31.2008],
       [33.0008, 32.3006, 40.8002, 50.7002, 58.4006, 68.9   , 75.0002,
        71.7998, 64.9994, 53.7998, 43.8998, 35.0006],
       [33.0998, 35.9996, 37.5008, 52.8998, 56.1992, 70.5002, 76.8002,
        71.2994, 64.0004, 45.8996, 40.2998, 36.6008],
       [40.2008, 31.6004, 43.7   , 50.5994, 57.3998, 62.7998, 75.9002,
        72.8996, 60.4994, 57.3998, 39.0002, 36.3992],
       [35.4002, 33.6002, 48.2   , 49.2008, 59.9   , 62.7008, 69.2006,
        66.3998, 62.9006, 51.9008, 39.7004, 36.5   ],
       [35.4992, 37.9004, 42.0008, 48.3998, 57.5996, 65.4008, 75.0992,
        69.6992, 66.2   , 52.5992, 44.9996, 33.2996],
       [40.6994, 33.6992, 39.3998, 53.9006, 60.9998, 71.6   , 74.4008,
        71.6   , 58.4006, 51.0008, 43.3994, 35.2994],
       [27.1994, 34.5992, 47.6006, 47.8994, 57.9992, 67.6994, 74.8004,
        73.7006, 64.5008, 55.2002, 44.9006, 30.2   ],
       [31.6004, 36.1004, 40.8002, 47.8004, 57.0992, 66.0992, 75.0002,
        69.6002, 60.9008, 46.0004, 46.0004, 31.1   ],
       [38.1992, 39.7004, 44.2994, 47.3   , 59.3006, 62.9996, 69.6002,
        69.6002, 63.1004, 44.4992, 43.8008, 26.7008],
       [33.0008, 30.0992, 42.6992, 48.7994, 53.9006, 66.9002, 72.5   ,
        72.3992, 66.5996, 55.0004, 39.7994, 37.2002],
       [33.0998, 32.    , 45.1994, 48.9002, 53.7008, 67.6004, 73.5008,
        75.0992, 63.3992, 52.8998, 42.2996, 32.1998],
       [38.8994, 32.3006, 50.7992, 54.3002, 60.2006, 74.1992, 74.8004,
        73.2002, 66.0002, 50.7992, 46.0994, 33.6992],
       [33.0008, 32.1008, 40.4996, 43.8008, 57.7004, 69.9008, 72.1994,
        72.1994, 65.1002, 47.8004, 43.1996, 31.4996],
       [34.5992, 32.    , 43.5992, 49.8002, 56.6006, 66.2   , 72.1994,
        69.2006, 63.8006, 55.2992, 38.3   , 33.9008],
       [36.5   , 36.6008, 46.0994, 50.1008, 52.3994, 68.3006, 70.3004,
        70.5002, 69.4004, 56.1992, 40.8002, 33.0998],
       [34.1006, 40.8992, 42.9998, 48.9992, 54.1004, 70.5002, 73.9994,
        70.3994, 64.9004, 58.8002, 47.4998, 32.    ],
       [32.1998, 42.2996, 50.3006, 48.9002, 55.7006, 68.7002, 73.9004,
        69.0008, 63.5   , 51.4994, 47.4998, 36.3992]])

Question 8: Write Function to Calculate Summary Statistics Using Axes (8 pts)

Write a function that calculates the sum across the rows of a numpy array.

Hints:

  • Recall which existing numpy function you can use to calculate a sum. You will include this function within the function you write to answer this question.
  • Review the lessons on functions to see the use of axes to calculate a statistic across the rows or columns of a numpy array.

Question 9: Apply Function and Save Output to New Numpy Array (4 pts)

Run your function on the numpy array containing the precipitation values in inches.

Save the output as a new numpy array and print the numpy array.

array([22.29, 25.88, 15.96, 18.28, 13.88, 22.02, 27.17, 17.52, 19.31,
       17.13, 17.02, 22.19, 20.32, 22.29, 15.65, 34.15, 23.57, 26.92,
       17.25, 21.99])

Question 10: Discuss Your Function and Output (6 pts)

  1. What are the dimensions of the numpy array you created for precipitation in Question 9? How many values does this numpy array contain? How do you know?

  2. Why does the numpy array you created for precipitation in Question 9 contain the number of values that it contains? What does each value in this numpy array represent?

Question 11: Write Function to Calculate Summary Statistics Using Axes (8 pts)

Write a function that calculates the median across the columns of a numpy array.

Hints:

  • Recall which existing numpy function you can use to calculate a median. You will include this function within the function you write to answer this question.
  • Review the lessons on functions to see the use of axes to calculate a statistic across the rows or columns of a numpy array.

Question 12: Apply Function and Save Output to New Numpy Array (4 pts)

Run your function on the numpy array containing the temperature values in Fahrenheit.

Save the output as a new numpy array and print the numpy array.

array([34.9997, 35.2994, 43.2995, 48.9497, 57.4997, 67.5005, 73.9499,
       70.9997, 63.9005, 51.9008, 43.6001, 33.4994])

Question 13: Discuss Your Function and Output (6 pts)

  1. What are the dimensions of the numpy array you created for temperature in Question 12? How many values does this numpy array contain? How do you know?

  2. Why does the numpy array you created for temperature in Question 12 contain the number of values that it contains? What does each value in this numpy array represent?

Question 14: Create Manual Numpy Array (2 pts)

Manually create a numpy array that contains the month names for January to December. Print your numpy array using print() to only see the values it contains.

Hints:

  • Review the practice with data structures activity to review how to create a numpy array manually.
  • You will want to use the quotes "text" to create the month names
['January' 'February' 'March' 'April' 'May' 'June' 'July' 'August'
 'September' 'October' 'November' 'December']

Question 15: Write Conditional Statement to Check Dimensions (4 pts)

Write a conditional statement to check that the dimensions are the same between the numpy array created for temperature in Question 12 and your numpy array containing the month names created in Question 14.

Using your conditional statement, print a message stating whether or not the dimensions are the same, so that you can plot these numpy arrays together.

Hint:

  • Compare the shape of the arrays, rather than the single value for the dimension.
  • Recall the operator to check equality between two values.
These arrays have the same shape and can be plotted together.

Question 16: Plot Your Numpy Arrays (4 pts)

Create a plot of your choosing to display your data from the numpy array created for temperature in Question 12 and your numpy array containing the month names created in Question 14.

Be sure to add a title and label the axes with the appropriate units.

For the title, think carefully about what this data is actually showing. Review your code that created this temperature array, so that you can title the plot appropriately.

png

Question 17: Practice Pseudo Coding (10 pts)

  1. Without using too much jargon, make a list of all of the steps that you had to complete prior to creating the plot for Question 16. (FYI - this is pseudo-coding! Identifying the steps needed in order to write and automate your code.)
    • Begin your list with the conversion of the temperature data from Celsius to Fahrenheit and end it with checking that the dimensions between the numpy arrays were the same, so you could plot.
  2. How could use functions to more easily accomplish the goal of creating the final plot? (Note: describe in words. You are not being asked to create a function to do this.)

  3. Identifying what you do not already know is also very important in pseudo-coding a workflow.
    • With this in mind, do you think it would be possible to create one function that accomplishes all of the steps listed in subquestion 1?
    • What would you need to know how to do, in order to write one function that completed all of the steps?

Tag the Instructor in Your Pull Request (2 pts)

In your pull request message to submit this homework, include @jlpalomino in your message for the Pull Request to notify the instructor of your submission.

Remember that you can edit a message for the pull request, if you forget to include it the first time. The message will be updated when you save the changes.

Part II: Submit Your Jupyter Notebook to GitHub

To submit your Jupyter Notebook for Homework 4, follow the Git/Github workflow from:

  1. Guided Activity on Version Control with Git/GitHub to add, commit, and push your Jupyter Notebook for Homework 4 to your forked repository for Homework 4 (https://github.com/yourusername/ea-bootcamp-hw-4-yourusername).

  2. Guided Activity to Submit Pull Request to submit a pull request of your Jupyter Notebook for Homework 4 to the Earth Lab repository for Homework 4 (https://github.com/earthlab-education/ea-bootcamp-hw-4-yourusername).

  3. This time include @jlpalomino in your message for the Pull Request to notify the instructor of your submission.