9. Simple Linear Regression
Introduction to Regression
Regression analysis is a statistical technique that builds upon the principles of correlation. Like correlation, regression allows us to describe the relationship between variables.
Regression Analysis
Regression analysis is a statistical technique that uses sample data to create a mathematical model of the relationship between two or more variables in the population.
If regression analysis is used to model the relationship between only two variables, it is referred to as simple regression analysis.
If regression analysis is used to model the relationship between more than two variables, it is referred to as multiple regression analysis.
Regression and Prediction
Knowing the relationship between variables is particularly useful because it enables us to make predictions. After all, if there exists a stable relationship between variables, then we can use the value of one variable to make a prediction about the value of another variable.
Regression analysis is used in many fields of study and the terminology used to describe the variables varies quite a bit. In this course, the variable that is being predicted is labeled #Y# and will be referred to as the outcome variable. In other sources, the outcome variable might be called the dependent variable, response variable, or regressand.
A variable that we use to make a prediction about #Y# is labeled #X# and will be referred to as a predictor variable. Other commonly used terms for a predictor variable are the independent variable, explanatory variable, or regressor.
Although there are many different ways in which variables can be related to one another, by far the most commonly studied relationship is the linear relationship between variables.
Linear Regression
Linear regression is a statistical technique used to create a linear model of the relationship between variables.
A linear relationship between two variables #X# and #Y# is visually represented by a straight line and can be described mathematically with the following equation:
\[Y = b_0 + b_1X\]
The values #b_0# and #b_1# are called the coefficients of the equation:
- The coefficient #b_0# is called the intercept or constant. It is the value of #Y# when #X = 0#, that is, the point at which the line crosses the vertical axis.
- The coefficient #b_1# is called the slope and represents the change in #Y# that corresponds to a one-unit increase in #X#.
- If #b_1# is positive, the line is increasing.
- If #b_1# is negative, the line is decreasing.
- If #b_1# is #0#, the line is flat.
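To see why the slope can be read this way, compare the value of the line at some value #x# with its value one unit further along, at #x + 1#. Here #x# is just a placeholder for any value of #X#:
\[\bigl(b_0 + b_1(x+1)\bigr) - \bigl(b_0 + b_1x\bigr) = b_1\]
For instance, with #b_0 = 2# and #b_1 = 0.5#, moving from #X = 4# to #X = 5# changes #Y# from #4# to #4.5#, a difference of #0.5#.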
Three examples of linear relationships:
\[Y=2+0.5X\]
\[Y=4-0.2X\]
\[Y=3\]
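The three lines above can also be explored numerically. The following is a minimal Python sketch, not part of the course material: the function names `predict` and `describe` and the choice of #X# values are assumptions made for illustration. It evaluates each line at a few values of #X# and reports whether the line is increasing, decreasing, or flat based on the sign of #b_1#.

```python
def predict(b0, b1, x):
    """Return the value of Y on the line Y = b0 + b1 * X at X = x."""
    return b0 + b1 * x


def describe(b1):
    """Classify a line by the sign of its slope b1."""
    if b1 > 0:
        return "increasing"
    if b1 < 0:
        return "decreasing"
    return "flat"


# The three example lines, stored as (intercept b0, slope b1).
# Y = 3 is the line with intercept 3 and slope 0.
lines = {
    "Y = 2 + 0.5X": (2.0, 0.5),
    "Y = 4 - 0.2X": (4.0, -0.2),
    "Y = 3": (3.0, 0.0),
}

for label, (b0, b1) in lines.items():
    # Evaluate the line at X = 0, 1, 2 (illustrative values only).
    ys = [predict(b0, b1, x) for x in (0, 1, 2)]
    print(f"{label}: {describe(b1)}, Y at X = 0, 1, 2 -> {ys}")
```

In each case the value at #X = 0# equals the intercept #b_0#, and consecutive values differ by the slope #b_1#.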