1. Descriptive Statistics: Measures of Variability
Range, Interquartile Range, and the Five-Number Summary
The most basic measure of variability is the range.
Range
Definition
The range is the difference between the highest and the lowest score of a distribution.
Formula
\[\text{range} = X_{max} - X_{min} \]
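As a small illustration of the formula, the range can be computed directly from the largest and smallest score. The list `scores` below is hypothetical data chosen for this sketch, not taken from the text.

```python
# Hypothetical scores, used only to illustrate the formula.
scores = [4, 7, 8, 10, 12, 15]

# range = X_max - X_min
score_range = max(scores) - min(scores)

print(score_range)  # 11
```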
Because the range depends only on the two most extreme scores of a distribution, it is extremely sensitive to outliers: a single unusually high or low score changes it directly. An alternative measure of variability that is much less sensitive to outliers is the interquartile range.
Interquartile range
Definition
The interquartile range (IQR) is the difference between the third quartile and the first quartile of a distribution.
Formula
\[\text{IQR} = Q_3 - Q_1\]
Recall that quartiles are measures of location that divide a distribution into four equal parts, just as the median divides a distribution into two equal parts. The interquartile range thus measures how spread out the middle 50% of the data is. As a result, the interquartile range does not depend on how extreme the smallest 25% and the largest 25% of the scores are.
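The contrast between the two measures can be sketched with Python's standard `statistics.quantiles` function. Both datasets below are hypothetical, and the quartile convention is an assumption: the function's default "exclusive" method is used here, and other conventions can give slightly different quartiles for the same data.

```python
from statistics import quantiles

def iqr(scores):
    """Return Q3 - Q1 for a list of scores."""
    # quantiles(..., n=4) returns the three cut points [Q1, median, Q3].
    # Assumption: the default "exclusive" method is the convention wanted.
    q1, _median, q3 = quantiles(scores, n=4)
    return q3 - q1

# Hypothetical data: the second list replaces the largest score by an outlier.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100]

print(iqr(scores), max(scores) - min(scores))                    # 6.0 10
print(iqr(with_outlier), max(with_outlier) - min(with_outlier))  # 6.0 99
```

In this sketch the outlier raises the range from 10 to 99, while the interquartile range stays at 6.0.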
Often, the values that define the range and the interquartile range (the minimum, the maximum, and the first and third quartiles) are combined with the median to form a so-called five-number summary.
Five-number summary
Definition
The five-number summary consists of the following five points of a distribution:
- Minimum
- First quartile
- Median
- Third quartile
- Maximum
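The five points listed above can be collected in one step. The following is a minimal sketch, assuming the default "exclusive" quartile method of `statistics.quantiles`; the function name `five_number_summary` and the data are hypothetical and serve only as illustration.

```python
from statistics import quantiles

def five_number_summary(scores):
    """Return (minimum, Q1, median, Q3, maximum) for a list of scores."""
    # Assumption: quartiles follow the default "exclusive" method.
    q1, median, q3 = quantiles(scores, n=4)
    return min(scores), q1, median, q3, max(scores)

# Hypothetical scores, used only for illustration.
scores = [2, 4, 4, 5, 7, 9, 10, 12, 13, 15, 21]
summary = five_number_summary(scores)

print(summary)  # (2, 4.0, 9.0, 13.0, 21)
```

From this summary the range is 21 - 2 = 19 and the interquartile range is 13.0 - 4.0 = 9.0.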
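To display the five-number summary graphically, construct a boxplot. In its basic form, the box runs from the first quartile to the third quartile, a line inside the box marks the median, and the whiskers extend to the minimum and the maximum.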
[Figure: Boxplot]