2. Association and Correlation: Correlation
Measuring the Relationship Between Two Variables
The relationship between two variables #X# and #Y# is characterized by three aspects:
- The direction of the relationship
- The form of the relationship
- The strength of the relationship
Direction
The direction of the relationship between two variables can be either positive or negative.
When two variables are positively related, they tend to move in the same direction. This means that as the value of #X# increases from one individual to the next, the value of #Y# tends to increase as well. Similarly, as #X# decreases, #Y# tends to decrease.
When two variables are negatively related, they tend to move in opposite directions. This means that as the value of #X# increases from one individual to the next, the value of #Y# tends to decrease. The opposite also holds true: as #X# decreases, #Y# tends to increase.
Form
The form of a relationship describes the general trend or pattern present in the data. The most commonly examined types of relationship in statistics are the linear relationship and the monotonic relationship.
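A linear relationship is graphically represented as a straight line. If two variables #X# and #Y# are linearly related, it means that every one-unit change in #X# is accompanied by the same fixed change in #Y#, regardless of the value of #X#. The change is constant in absolute terms, not as a percentage.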
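To make the idea of a straight-line relationship concrete, the sketch below tabulates a hypothetical line #Y = 3 + 2X#. The intercept and slope are assumptions chosen purely for illustration. It computes the change in #Y# for each one-unit step in #X#, and, for contrast, the percentage change in #Y# over the same steps.

```python
# Hypothetical linear relationship y = 3 + 2x.
# The intercept (3) and slope (2) are illustrative assumptions.
def linear(x, intercept=3.0, slope=2.0):
    return intercept + slope * x

xs = [0, 1, 2, 3, 4]
ys = [linear(x) for x in xs]

# Absolute change in y for each one-unit step in x.
steps = [ys[i + 1] - ys[i] for i in range(len(ys) - 1)]

# Percentage change in y over the same steps, for contrast.
pct_changes = [100 * (ys[i + 1] - ys[i]) / ys[i] for i in range(len(ys) - 1)]

print("absolute steps:    ", steps)
print("percentage changes:", [round(p, 1) for p in pct_changes])
```

In this sketch the absolute steps are all identical, while the percentage changes differ from step to step. The constant absolute step is what makes the graph a straight line.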
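A monotonic relationship is one in which #Y# consistently moves in a single direction (or stays constant) as #X# increases, though not necessarily at a constant rate. Just like a linear relationship, a monotonic relationship can have either a positive or a negative direction: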
- If two variables have a positive monotonic relationship, this means that as #X# increases, #Y# either increases as well or remains constant, but never decreases. A positive linear relationship is thus also an example of a positive monotonic relationship.
- If two variables have a negative monotonic relationship, this means that as #X# increases, #Y# either decreases or remains constant, but never increases.
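The two definitions above can be checked mechanically once the observations are ordered by #X#. The following is a minimal sketch using made-up data; the helper names `is_non_decreasing` and `is_non_increasing` are hypothetical and not part of any library.

```python
def is_non_decreasing(values):
    """True if each value is at least as large as the one before it."""
    return all(b >= a for a, b in zip(values, values[1:]))

def is_non_increasing(values):
    """True if each value is at most as large as the one before it."""
    return all(b <= a for a, b in zip(values, values[1:]))

xs = [1, 2, 3, 4, 5]                  # already sorted by x

cubic = [x ** 3 for x in xs]          # curved, but always rising
plateau = [1, 2, 2, 2, 5]             # rises, stays constant, rises again
falling = [10 - 2 * x for x in xs]    # straight line going down
u_shape = [(x - 3) ** 2 for x in xs]  # falls, then rises

for name, ys in [("cubic", cubic), ("plateau", plateau),
                 ("falling", falling), ("u_shape", u_shape)]:
    if is_non_decreasing(ys):
        kind = "positive monotonic"
    elif is_non_increasing(ys):
        kind = "negative monotonic"
    else:
        kind = "not monotonic"
    print(name, "->", kind)
```

Note that `cubic` and `plateau` are positive monotonic without being linear, whereas `falling` is both linear and negative monotonic. Real data rarely satisfy these conditions exactly, so in practice the check describes the ideal pattern rather than a test to apply to noisy observations.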
Strength
The strength of the relationship between two variables is measured by the magnitude of the correlation coefficient. A correlation coefficient always takes on a value between #-1# and #+1#:
- A coefficient of #-1# indicates a perfect negative relationship
- A coefficient of #+1# indicates a perfect positive relationship
- A coefficient of #0# indicates that there is no relationship of the type the coefficient measures
It is important to note that every correlation coefficient measures a specific type of relationship. This means that finding a coefficient of #0# does not automatically mean there is absolutely no relationship between the two variables. Rather, it means that there is no indication of the specific relationship that is being tested for.
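A short sketch can illustrate this point. It assumes the coefficient in question is the Pearson product-moment coefficient, which measures linear relationships; the text above does not name a specific coefficient, so this choice is an assumption. The data are made up: #Y# is completely determined by #X# through #Y = X^2#, yet the linear coefficient comes out as zero.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation, written out from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data: y depends perfectly on x, but the pattern is a symmetric curve.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print("Pearson r for y = x^2:", r)
```

The coefficient is zero here because the falling left half of the curve cancels the rising right half. There is clearly a relationship, just not a linear one.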
When looking at a scatter plot, the strength of the relationship corresponds to how closely the data points follow the predicted pattern. Below are #6# scatter plots that display an increasingly strong positive linear relationship. The dotted line in each graph represents a perfect positive linear relationship.
[Figure: six scatter plots showing an increasingly strong positive linear relationship, each with a dotted line representing a perfect positive linear relationship]
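The link between scatter and strength can also be sketched numerically. The example below builds six hypothetical data sets around the line #Y = X#; they are not the data behind the scatter plots described above. Each set adds a fixed alternating offset of decreasing size, which is a deterministic stand-in for random scatter. It then computes the Pearson coefficient for each set. Using Pearson's coefficient here is an assumption, since the text does not name a specific coefficient.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation, written out from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = list(range(10))

# Alternating +1 / -1 pattern: a deterministic stand-in for random scatter.
offsets = [1 if i % 2 == 0 else -1 for i in range(len(xs))]

# Hypothetical scatter sizes, from widely scattered down to no scatter at all.
amplitudes = [8, 6, 4, 2, 1, 0]

coefficients = []
for a in amplitudes:
    # Points on the line y = x, pushed up or down by the scaled offset.
    ys = [x + a * e for x, e in zip(xs, offsets)]
    coefficients.append(pearson_r(xs, ys))

for a, r in zip(amplitudes, coefficients):
    print(f"scatter size {a}: r = {r:.3f}")
```

As the points move closer to the line, the coefficient rises toward #+1#, and it reaches #+1# once every point lies exactly on the line.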