As an **undergraduate Mathematics** or **Data Science** teacher, you can use this set of computer-based tools to help you in teaching **Introductory Statistics** and specifically **Linear Regression**.

### Introduction

This lesson plan will help you to teach **Introductory Statistics for Data Science** through a **Linear Regression** assignment. The lesson plan includes **a hands-on computer-based classroom activity** to be conducted on a dataset of **Yearly Global Average CO2 Concentrations in parts per million**. This activity includes hands-on **Python code** and **a set of inquiry-based questions** that will enable your students to apply their understanding of **scatter plots, regression equations, correlation coefficients, and linear regression**.

Thus, the use of this lesson plan allows you to integrate the teaching of a climate science topic with a core topic in **Mathematics, Statistics and Data Science**.

### Questions

Use this lesson plan to help your students work through the following tasks:

- Use an example to describe linear regression analysis.
- Use linear regression analyses to describe how global average CO2 concentrations have changed from 1980 to 2020 (last datapoint).
- Discuss reasons for changes in global average CO2 concentrations and their impact on Earth’s climate.

### About the Lesson Plan

Grade Level | Undergraduate |

Discipline | Mathematics, Data Science |

Topic(s) in Discipline | Scatter Plots, Correlation Coefficients, Regression Equations, Linear Regression |

Climate Topic | Climate and the Atmosphere, Climate Variability Record |

Location | Global |

Language(s) | English |

Access | Online, Offline |

Computer Skills Required | Intermediate |

Approximate Time Required | 60-80 min |

### Contents

Teaching Module (25 min) | A teaching module that explains the basics of scatter plots, correlation coefficients, regression equations, and linear regression |

Video micro-lecture (~14 min) | A video micro-lecture that introduces simple linear regression |

Classroom/Laboratory activity (30 min) | A classroom activity with Python code for applying linear regression to a dataset of yearly global average CO2 concentrations in parts per million (ppm), 1980-2020 |

### Video

Here is a step-by-step guide to using this lesson plan in the classroom/laboratory. We have suggested these steps as a possible plan of action. You may customize the lesson plan according to your preferences and requirements.

1 | Topic introduction and discussion | 1. Use the teaching module, ‘Introduction-Linear Regression and Correlation’ by OpenStax™, Rice University (for High School level) or ‘Chapter-3: Linear Regression’ provided by Ramesh Sridharan, Massachusetts Institute of Technology (for Undergraduate level), to introduce these topics of basic statistics.
2. Navigate to the sub-sections within the module to cover the basics of scatter plots, correlation coefficients, regression equations, and linear regression.
3. Use the built-in practice exercises and quizzes to evaluate your students’ understanding of the topics. |

2 | Develop the topic further | Use the video micro-lecture, ‘Introduction to Simple Linear Regression’ by dataminingincae, INCAE Business School, for a basic introduction to simple linear regression and to terms such as dependent variable, independent variable, regression line, and regression coefficients. |

3 | Extend understanding by practicing hands-on Python code | 1. Use the provided dataset on Yearly Global Average CO2 Concentrations (file global-atm-co2.csv) and the Python Notebook global-co2-concentration.ipynb.
2. The dataset includes monthly mean carbon dioxide globally averaged over marine surface sites for the span 1980-2020. Data Source: National Oceanic and Atmospheric Administration (NOAA)
3. Use the Python Notebook and dataset to:
- Read the dataset into a DataFrame
- Inspect the basics of the dataset, such as its dimensions, data types, and memory usage
- Draw a scatter plot of the yearly average_co2_concentrations variable
- Use the NumPy library to convert the DataFrame to a NumPy array for use in the later steps
- Find the regression coefficients for simple linear regression
- Plot the scatter plot together with the regression line given by the estimated coefficients
- Calculate the RMSE (Root Mean Squared Error)
- Discuss how well the regression line describes the data points.
4. Encourage your students to answer topical questions by applying their understanding of scatter plots, correlation coefficients, regression equations, and linear regression.
5. Use the regression analyses performed to initiate a discussion on the increase in global average CO2 concentrations from 1980 to 2020 due to anthropogenic activities causing CO2 emissions. |
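The core of step 3 (reading the data, estimating the regression coefficients, and computing the RMSE) can be sketched in plain Python so that the arithmetic stays visible. This is a minimal illustration and not the provided notebook: the CSV text below is a synthetic stand-in with made-up values, the `year` column name is an assumption about the file layout, and the helper names `read_columns`, `fit_line`, and `rmse` are hypothetical.

```python
import csv
import io
import math

# Synthetic stand-in for global-atm-co2.csv: made-up values, NOT NOAA data.
# The column names are assumptions about the file layout.
SAMPLE_CSV = """year,average_co2_concentrations
2000,370.0
2001,371.5
2002,373.5
2003,375.5
2004,377.0
"""


def read_columns(text):
    """Parse CSV text into two lists of floats: years and CO2 values."""
    rows = list(csv.DictReader(io.StringIO(text)))
    years = [float(row["year"]) for row in rows]
    co2 = [float(row["average_co2_concentrations"]) for row in rows]
    return years, co2


def fit_line(x, y):
    """Closed-form ordinary least squares for y = b0 + b1 * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sxy / sxx               # slope: change in ppm per year
    b0 = y_mean - b1 * x_mean    # intercept: fitted value at x = 0
    return b0, b1


def rmse(x, y, b0, b1):
    """Root mean squared error of the fitted line on the given points."""
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


years, co2 = read_columns(SAMPLE_CSV)
b0, b1 = fit_line(years, co2)
error = rmse(years, co2, b0, b1)
print(f"slope = {b1:.3f} ppm/year, intercept = {b0:.1f} ppm, RMSE = {error:.3f} ppm")
```

In a pandas and NumPy workflow like the one the steps describe, `pandas.read_csv` and `DataFrame.to_numpy` would typically take the place of `read_columns`, and the same two formulas would operate on array columns. A small RMSE relative to the spread of the data indicates that the line follows the points closely, which gives students a concrete number for the final discussion step.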

Suggested questions/assignments for learning evaluation:

- Use the tools and the concepts learned so far to discuss and determine answers to the following questions:
  - Use an example to describe linear regression analysis.
  - Determine the difference in the confidence intervals for the slopes for two 30-year datasets: 1850-1880 (beginning of industrial age) and 1987-2017 (last datapoint). What does the result suggest?
  - Use linear regression analyses to describe how global temperatures have changed from 1850 (pre-industrial) to 2017 (last datapoint).
  - Discuss reasons for global warming and its impact on Earth’s climate.
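For the question on confidence intervals of slopes, one possible approach is to compute the standard error of the fitted slope and multiply it by a Student-t critical value. The sketch below assumes the usual simple-linear-regression model with independent, equal-variance errors. The function name `slope_confidence_interval` is hypothetical, the data points are synthetic illustration values, and the critical value is passed in by hand from a t-table, not computed.

```python
import math


def slope_confidence_interval(x, y, t_crit):
    """Return (slope, standard error, lower bound, upper bound).

    t_crit is the two-sided Student-t critical value for n - 2 degrees
    of freedom, looked up by the caller in a t-table.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Sum of squared residuals around the fitted line.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Residual variance uses n - 2 because two coefficients were estimated.
    std_err = math.sqrt(sse / (n - 2) / sxx)
    half_width = t_crit * std_err
    return slope, std_err, slope - half_width, slope + half_width


# Synthetic illustration values (not measurements from any dataset).
years = [2000, 2001, 2002, 2003, 2004]
values = [370.0, 371.5, 373.5, 375.5, 377.0]

# 3.182 is the 97.5th percentile of Student's t with 5 - 2 = 3 degrees of freedom.
slope, std_err, lower, upper = slope_confidence_interval(years, values, t_crit=3.182)
print(f"slope = {slope:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Running the function once per period and comparing the two intervals is one way to frame the discussion: intervals that do not overlap suggest that the rate of change differs between the periods. For a series of 30 points the degrees of freedom would be 28, with a critical value of about 2.048.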

The tools in this lesson plan will enable students to:

- learn about linear regression and correlation
- understand linear regression equations and related terms such as correlation coefficients
- use linear regression analyses and confidence intervals of slopes of regression lines to describe global temperature anomalies from pre-industrial to recent times (1850-2017)
- discuss how these changes suggest that the planet has warmed significantly since the beginning of the industrial age
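The correlation coefficient named in these outcomes can be computed directly from its definition. The following is a small sketch with synthetic values, and `pearson_r` is a hypothetical helper name, not a function from the lesson's notebook.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Synthetic illustration values (not measurements from any dataset).
years = [2000, 2001, 2002, 2003, 2004]
values = [370.0, 371.5, 373.5, 375.5, 377.0]

r = pearson_r(years, values)
print(f"r = {r:.4f}, r squared = {r * r:.4f}")
```

A value of r close to +1 indicates a strong increasing linear relationship. In simple linear regression, r squared equals the fraction of the variance in the dependent variable that the regression line explains, which links the correlation coefficient to the quality of the fit.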

1 | Teaching Module, “Introduction- Linear Regression and Correlation” | Provided by OpenStax™, Rice University |

2 | Teaching Module, “Chapter 3: Linear Regression” | Provided by Ramesh Sridharan, MIT from ‘Statistics for Research Projects’ |

3 | Video micro-lecture, “Introduction to Simple Linear Regression” | Presented by dataminingincae, INCAE Business School |

4 | Dataset | Monthly mean carbon dioxide globally averaged over marine surface sites for the span 1980-2020; Data Source: National Oceanic and Atmospheric Administration (NOAA) |

### Python Notebook
