1. Descriptive Statistics: Measures of Variability
Interquartile Range Rule for Identifying Outliers
A common method for identifying outliers is the Interquartile Range Rule.
Interquartile Range Rule
According to the Interquartile Range Rule, a score #X# is considered an outlier if:
- The score lies more than #1.5\cdot IQR\,# below the first quartile: #X < (Q_1 - 1.5\cdot IQR)#
- The score lies more than #1.5\cdot IQR\,# above the third quartile: #X > (Q_3 + 1.5\cdot IQR)#
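The two conditions can also be written as a single check. The following is a minimal Python sketch; the function name `is_outlier`, its parameters, and the quartile values in the usage lines are illustrative assumptions rather than part of the rule itself.

```python
def is_outlier(x, q1, q3, k=1.5):
    """Return True if x lies more than k * IQR below q1 or above q3."""
    iqr = q3 - q1
    lower_fence = q1 - k * iqr  # Q1 - 1.5 * IQR
    upper_fence = q3 + k * iqr  # Q3 + 1.5 * IQR
    return x < lower_fence or x > upper_fence


# Hypothetical quartiles Q1 = 10 and Q3 = 20 give IQR = 10,
# so the fences are at -5 and 35.
print(is_outlier(40, 10, 20))  # True
print(is_outlier(35, 10, 20))  # False
```

Note that both inequalities are strict, so a score lying exactly on a fence is not flagged.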
Consider the following sample of scores:
\[53\,\,\,\,\,\,53\,\,\,\,\,\,60\,\,\,\,\,\,93\,\,\,\,\,\,106\,\,\,\,\,\,108\,\,\,\,\,\,76\,\,\,\,\,\,74\,\,\,\,\,\,78\,\,\,\,\,\,69\,\,\,\,\,\,72\,\,\,\,\,\,80\,\,\,\,\,\,67\]
Based on the Interquartile Range Rule, how many outliers are there in the sample?
To calculate the interquartile range, first sort the values in ascending order:
\[53\,\,\,\,\,\,53\,\,\,\,\,\,60\,\,\,\,\,\,67\,\,\,\,\,\,69\,\,\,\,\,\,72\,\,\,\,\,\,74\,\,\,\,\,\,76\,\,\,\,\,\,78\,\,\,\,\,\,80\,\,\,\,\,\,93\,\,\,\,\,\,106\,\,\,\,\,\,108\,\,\,\,\,\,\]
Next, calculate the first quartile. To find the index #i_1# of the first quartile (#Q=1#), use the following formula, where #n=13# is the number of scores in the sample:
\[\begin{array}{rcl}
i_1 &=& \cfrac{Q}{4}(n-1)+1\\
&=& \cfrac{1}{4}(13 - 1) + 1=4
\end{array}\]
Since #i_1=4# is an integer, the first quartile is the score located at the #4^{th}# position of the ordered data:
\[X_{4} = 67\]
Next, calculate the third quartile. To find the index #i_3# of the third quartile (#Q=3#), use the following formula:
\[\begin{array}{rcl}
i_3 &=& \cfrac{Q}{4}(n-1)+1\\
&=& \cfrac{3}{4}(13 - 1) + 1=10
\end{array}\]
Since #i_3=10# is an integer, the third quartile is the score located at the #10^{th}# position of the ordered data:
\[X_{10} = 80\]
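The index formula used for both quartiles can be sketched in Python as follows. The helper name `quartile` is hypothetical. The worked example only covers the case where the index is an integer; the linear interpolation used here for a non-integer index is an assumption, not something the text above specifies.

```python
import math


def quartile(sorted_scores, q):
    """Quartile q (1, 2 or 3) located at index i = q/4 * (n - 1) + 1."""
    n = len(sorted_scores)
    i = q / 4 * (n - 1) + 1           # 1-based position in the ordered data
    lower = math.floor(i)
    frac = i - lower
    value = sorted_scores[lower - 1]  # convert to a 0-based list index
    if frac > 0:
        # Assumption: interpolate linearly between the two neighbouring scores
        value += frac * (sorted_scores[lower] - value)
    return value


scores = sorted([53, 53, 60, 93, 106, 108, 76, 74, 78, 69, 72, 80, 67])
q1 = quartile(scores, 1)
q3 = quartile(scores, 3)
print(q1, q3)  # 67 80
```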
Calculate the interquartile range:
\[\text{IQR}=Q_3-Q_1=80-67=13\]
According to the Interquartile Range Rule, a score #X# is considered an outlier if:
- The score lies more than #1.5\cdot IQR\,# below the first quartile: #X < (Q_1 - 1.5\cdot IQR)#
\[Q_1 - 1.5\cdot IQR = 67 - 1.5 \cdot 13 = 47.5\]
- The score lies more than #1.5\cdot IQR\,# above the third quartile: #X > (Q_3 + 1.5\cdot IQR)#
\[Q_3 + 1.5\cdot IQR = 80 + 1.5 \cdot 13 = 99.5\]
This means that any score #X<47.5# or #X>99.5# is considered an outlier. There are #2# such scores in the sample: #106# and #108#.
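As a cross-check, the whole calculation can be reproduced with Python's standard library. This sketch assumes that `statistics.quantiles` with `method="inclusive"` is an acceptable stand-in for the index formula #\cfrac{Q}{4}(n-1)+1#; it yields the same quartiles for this sample. The variable names are illustrative.

```python
import statistics

scores = [53, 53, 60, 93, 106, 108, 76, 74, 78, 69, 72, 80, 67]

# method="inclusive" places quartile Q at position Q/4 * (n - 1) + 1
q1, _median, q3 = statistics.quantiles(scores, n=4, method="inclusive")

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = sorted(x for x in scores if x < lower_fence or x > upper_fence)
print(lower_fence, upper_fence, outliers)  # 47.5 99.5 [106, 108]
```

The lowest score, #53#, lies above the lower fence of #47.5#, so no outliers are found at the low end of the sample.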