2. Association and Correlation: Correlation
Strength of a Linear Relationship: Pearson Correlation Coefficient
To determine the direction and strength of the linear relationship between two variables, calculate the Pearson Correlation Coefficient.
Pearson Correlation Coefficient
Definition
The Pearson Correlation Coefficient is the standardized form of the covariance.
It is used to measure the direction and strength of the linear relationship between two quantitative variables.
The population and sample Pearson Correlation Coefficients are denoted by #\rho# and #r#, respectively.
Formula
\[\begin{array}{rcl}
\rho(X,Y)&=&\dfrac{\sigma_{X,Y}}{\sigma_X \sigma_Y}\\\\
r(X,Y)&=&\dfrac{s_{X,Y}}{s_Xs_Y}\\
\end{array}\]
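The sample formula can be traced step by step in a short script. The following is a minimal Python sketch, not part of the Excel or R workflow described in this section. The function name `pearson_r` and the data (`hours`, `score`) are hypothetical and chosen only for illustration. The sketch assumes the sample covariance and sample standard deviations use the usual #n-1# denominator.

```python
from math import sqrt

def pearson_r(x, y):
    """Sample Pearson correlation r = s_xy / (s_x * s_y). Hypothetical helper."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance s_xy, assuming the n - 1 denominator
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    # Sample standard deviations s_x and s_y
    s_x = sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    s_y = sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return s_xy / (s_x * s_y)

# Made-up illustrative data
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
r = pearson_r(hours, score)
print(round(r, 4))  # approximately 0.7746
```

For this data #s_{X,Y} = 1.5#, #s_X = \sqrt{2.5}# and #s_Y = \sqrt{1.5}#, so #r = 1.5 / \sqrt{3.75} \approx 0.77#, a fairly strong positive linear relationship.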
Computation of the Pearson Correlation with Statistical Software
To compute the sample Pearson Correlation Coefficient between two variables #X# and #Y# in Excel, make use of the following function:
CORREL(x, y)
- x: The cell range or array that contains the values for variable #X#
- y: The cell range or array that contains the values for variable #Y#
To compute the sample Pearson Correlation Coefficient between two variables #X# and #Y# in R, make use of the following function:
cor(x, y)
- x: The numeric vector that contains the values for variable #X#
- y: The numeric vector that contains the values for variable #Y#
The Pearson Correlation Coefficient always takes on a value between #-1# and #+1#, endpoints included:
- A value of #+1# indicates a perfect positive linear relationship between two variables.
- A value of #-1# indicates a perfect negative linear relationship between two variables.
- A value of #0# indicates the variables are linearly unrelated.
It is important to remember that the Pearson Correlation Coefficient only measures the linear relationship between two variables. Consequently, a Pearson coefficient of #0# does not necessarily mean that the two variables are completely unrelated; it only indicates that there is no linear relationship.
The scatterplot below shows an example of two variables that have a Pearson Correlation Coefficient of #0#, but do have a perfect quadratic relationship.
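A perfect quadratic relationship with a coefficient of #0# can also be checked numerically. The Python sketch below uses made-up data in which #Y = X^2# and the #X# values are symmetric around #0#. The helper `pearson_r` is a hypothetical name, and it is written in the equivalent form in which the #n-1# factors of the covariance and standard deviations cancel.

```python
from math import sqrt

def pearson_r(x, y):
    """Sample Pearson correlation (hypothetical helper, n - 1 factors cancelled)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations
    sp_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp_xy / sqrt(ss_x * ss_y)

# Illustrative data: y is completely determined by x, but not linearly
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r = pearson_r(x, y)
print(r)  # 0.0: no linear relationship, despite the exact quadratic one
```

The positive cross-products on the right half of the parabola are cancelled exactly by the negative ones on the left half, which is why the coefficient is #0# even though #Y# can be predicted perfectly from #X#.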
The Pearson Correlation Coefficient is also very sensitive to outliers. A single outlier can drastically change the magnitude of the coefficient, and sometimes even its sign.
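The sensitivity to outliers can be illustrated with a small numerical experiment. The Python sketch below uses made-up data and a hypothetical helper `pearson_r`. It computes the coefficient for five weakly related observations and then again after adding one extreme observation.

```python
from math import sqrt

def pearson_r(x, y):
    """Sample Pearson correlation (hypothetical helper, n - 1 factors cancelled)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp_xy / sqrt(ss_x * ss_y)

# Five made-up observations with only a weak linear relationship
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 3]
r_without = pearson_r(x, y)

# The same data plus a single extreme observation at (20, 20)
r_with = pearson_r(x + [20], y + [20])

print(round(r_without, 2))  # approximately 0.14
print(round(r_with, 2))     # approximately 0.97
```

One added point moves the coefficient from a weak to a nearly perfect positive value, so it is worth inspecting a scatterplot for outliers before interpreting #r#.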