9. Simple Linear Regression: Simple Linear Regression
The most basic form of regression analysis is simple linear regression.
Simple Linear Regression
Simple linear regression is a statistical technique used to model the linear relationship between two variables.
It does this by finding the best-fitting straight line through a set of data points.
This best-fitting line is called the regression line and is mathematically described by the following regression equation:
\[\hat{Y} = b_0 + b_1X\]
where #\hat{Y}# is the predicted value of #Y#, #b_0# is the intercept of the line, and #b_1# is its slope.
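To make the idea of a best-fitting line concrete, the following is a minimal sketch of how the two coefficients could be computed. It assumes that "best-fitting" means ordinary least squares, which the text does not state explicitly. The function name `fit_simple_linear_regression` and the four data points are made up for illustration and are not taken from the text.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Return (b0, b1) of the least-squares line Y_hat = b0 + b1 * X.

    Assumption: "best-fitting" is taken to mean ordinary least squares.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 points")

    x_bar, y_bar = mean(xs), mean(ys)

    # Sum of squared deviations of X, and sum of cross-deviations of X and Y
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if s_xx == 0:
        raise ValueError("all X values are identical; the slope is undefined")

    b1 = s_xy / s_xx          # slope
    b0 = y_bar - b1 * x_bar   # intercept: the line passes through (x_bar, y_bar)
    return b0, b1


# Hypothetical data (NOT from the text): points lying exactly on Y = 1 + 2X
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
b0, b1 = fit_simple_linear_regression(xs, ys)
print(b0, b1)
```

Because the hypothetical points lie exactly on a straight line, the fitted coefficients recover that line's intercept and slope. With real data the points scatter around the line, and the same formulas give the line that minimizes the sum of squared vertical distances.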
Usefulness of the Regression Equation
Finding the values of the coefficients of a regression equation serves a dual purpose:
- It helps us understand the relationship between the two variables. The sign of the slope, for instance, gives the direction of the linear relationship between #X# and #Y#, and its size gives the predicted change in #Y# for each one-unit increase in #X#.
- It allows us to make predictions about #Y# on the basis of #X#. Once we have found the regression equation, we can simply plug in a value for #X# to make a prediction about the value of #Y#.
For #10# days, the owner of an ice cream truck kept track of how much ice cream he sold and what the maximum temperature in #^\circ{}C# was that day. He then performed a simple linear regression to construct a regression line in the hopes of finding the relationship between the maximum temperature and the amount of ice cream sold.
Take a look at the scatterplot below. The blue dots represent the #10# #\blue{\textbf{data points}}# that serve as the basis for the regression analysis. The #\orange{\textbf{regression line}}# #\hat{Y} = -20.45 + 2.93X# is drawn in orange.
The slope #b_1# is #2.93#. This value is the predicted change in the amount of ice cream sold #Y# when the maximum temperature #X# increases by #1^\circ{}C#. For example, if the maximum temperature increases by #2^\circ{}C#, the amount of ice cream sold is predicted to increase by #2\cdot 2.93=5.86#.
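The factor of #2# follows directly from the regression equation. As a short worked step, write #\hat{Y}(X)# for the prediction at temperature #X# (this function notation is introduced here only for illustration) and subtract the prediction at #X# from the prediction at #X+2#. The intercept cancels, leaving only the slope term:
\[\hat{Y}(X+2) - \hat{Y}(X) = \big(b_0 + b_1(X+2)\big) - \big(b_0 + b_1X\big) = 2\,b_1 = 2\cdot 2.93 = 5.86\]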
The intercept #b_0# is #-20.45#. This is the predicted amount of ice cream sold at a maximum temperature of #X=0#. In this case, the negative value of the intercept holds no particular meaning, since it is not possible to sell a negative amount of ice cream.
To calculate the predicted amount of ice cream sold at a particular maximum temperature, we simply enter a value for #X# into the equation. For example, at a maximum temperature of #X=25#, the predicted amount of ice cream sold is:
\[\hat{Y}=-20.45 + 2.93X=-20.45 + 2.93\cdot25=52.8\]
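The same calculation can be sketched in a few lines of Python. The coefficients are the ones from the ice cream example. The constant names and the function name `predict_ice_cream_sold` are hypothetical and chosen only for this illustration.

```python
# Coefficients of the regression line from the ice cream example
B0 = -20.45  # intercept
B1 = 2.93    # slope: predicted change in sales per 1 degree Celsius


def predict_ice_cream_sold(max_temp_c):
    """Predicted amount of ice cream sold at a given maximum temperature."""
    return B0 + B1 * max_temp_c


prediction = predict_ice_cream_sold(25)
print(round(prediction, 2))  # 52.8

# Raising the temperature by 2 degrees raises the prediction by 2 * B1
increase = predict_ice_cream_sold(27) - predict_ice_cream_sold(25)
print(round(increase, 2))  # 5.86
```

The results are rounded before printing because floating-point arithmetic can leave a tiny error in the last digits.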