7. Hypothesis Testing: One-sample t-test
Confidence Interval for μ when σ is Unknown
Thus far, we have used the following formula to compute a #C\%\,CI# for a population mean #\mu#:
\[CI_{\mu}=\bigg(\bar{X} - z^*\cdot \cfrac{\sigma}{\sqrt{n}},\,\,\,\, \bar{X} + z^*\cdot \cfrac{\sigma}{\sqrt{n}} \bigg)\]
Just like the calculation of the #Z#-statistic, this calculation requires the population standard deviation #\sigma# to be known.
When #\sigma# is unknown, we will have to rely on the #t#-distribution and the sample standard deviation #s# to compute the confidence interval.
Confidence Interval for a Population Mean when σ is Unknown
Assuming the sampling distribution of the sample mean is (approximately) normal, the general formula for computing a #C\%# CI for a population mean #\mu#, based on a random sample of size #n#, when σ is unknown, is:
\[CI_{\mu}=\bigg(\bar{X} - t^*\cdot \cfrac{s}{\sqrt{n}},\,\,\,\, \bar{X} + t^*\cdot \cfrac{s}{\sqrt{n}} \bigg)\]
where #t^*# is the critical value of the #t_{n-1}# distribution such that #\mathbb{P}(-t^* \leq t \leq t^*)=\frac{C}{100}#.
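Because the #t#-distribution is symmetric about #0#, the central-area condition can be rewritten as a cumulative (left-tail) probability, which is the form most statistical software expects. The central area #\frac{C}{100}# leaves #\frac{1}{2}\big(1-\frac{C}{100}\big)# in each tail, so:
\[\mathbb{P}(t \leq t^*) = \cfrac{C}{100} + \cfrac{1}{2}\bigg(1 - \cfrac{C}{100}\bigg) = \cfrac{100+C}{200}\]
For example, #C=95# gives #\mathbb{P}(t \leq t^*) = \frac{195}{200} = 0.975#.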
Calculating t* with Statistical Software
Let #C# be the confidence level in #\%#.
To calculate the critical value #t^*# in Excel, make use of the function T.INV():
\[=\text{T.INV}((100+C)/200, n \text{ - } 1)\]
To calculate the critical value #t^*# in R, make use of the function qt():
\[\text{qt}(p=(100+C)/200, df=n \text{ - } 1,lower.tail = \text{TRUE})\]
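For readers without Excel or R at hand, the same critical value can be approximated with a short numerical sketch. The Python below uses only the standard library. The function names `t_pdf`, `t_cdf` and `t_quantile` are hypothetical, and Simpson's rule with bisection is a choice made for this illustration; it is not a description of how T.INV() or qt() work internally. The search bracket of #[-100, 100]# is also an assumption that suits moderate degrees of freedom and confidence levels.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_quantile(p, df):
    """Value q with P(T <= q) = p, located by bisection (assumed bracket)."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Critical value for a C% confidence interval with n observations.
C, n = 91, 92
t_star = t_quantile((100 + C) / 200, n - 1)
print(round(t_star, 4))
```

With #C=91# and #n=92#, the printed value should agree closely with the result that T.INV() and qt() return for the same inputs.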
He collects a random sample of #92# students. On average, these students spent #9.0# hours a week on their homework with a standard deviation of #s=1.4# hours.
Construct a #91\%# confidence interval for the population mean #\mu#. Round your answers to #3# decimal places.
There are a number of different ways we can compute the confidence interval. Two solutions follow: the first uses Excel, the second uses R.
Since the population standard deviation #\sigma# is unknown, we will have to use the #t#-distribution and the sample standard deviation #s# to construct the confidence interval.
A sample size of #n=92# is considered large enough for the Central Limit Theorem to apply.
This means that, although the sample in question comes from a population having an unknown distribution, the sampling distribution of the sample mean is approximately normal.
Assuming the sampling distribution of the sample mean is (approximately) normal, the general formula for computing a #C\%\, CI# for the population mean #\mu#, based on a random sample of size #n#, is:
\[CI_{\mu}=\bigg(\bar{X} - t^*\cdot \cfrac{s}{\sqrt{n}},\,\,\,\, \bar{X} + t^*\cdot \cfrac{s}{\sqrt{n}} \bigg)\]
For a given confidence level #C# (in #\%#), the critical value #t^*# of the #t_{n-1}# distribution is the value such that #\mathbb{P}(-t^* \leq t \leq t^*)=\cfrac{C}{100}#.
To calculate this critical value #t^*# in Excel, make use of the following function:
T.INV(probability, deg_freedom)
- probability: The cumulative (left-tail) probability of the #t#-distribution for which the corresponding #t#-value is required.
- deg_freedom: The number of degrees of freedom of the #t#-distribution.
Here, we have #C=91#. Thus, to calculate #t^*# such that #\mathbb{P}(-t^* \leq t \leq t^*)=0.91#, run the following command:
\[\begin{array}{c}
=\text{T.INV}((100+C)/200, n - 1)\\
\downarrow\\
=\text{T.INV}(191/200, 92 \text{ - } 1)
\end{array}\]
This gives:
\[t^* = 1.71364\]
Calculate the lower bound #L# of the confidence interval:
\[L = \bar{X} - t^* \cdot \cfrac{s}{\sqrt{n}} = 9.0 - 1.71364 \cdot \cfrac{1.4}{\sqrt{92}}=8.750\]
Calculate the upper bound #U# of the confidence interval:
\[U = \bar{X} + t^* \cdot \cfrac{s}{\sqrt{n}} = 9.0 + 1.71364 \cdot \cfrac{1.4}{\sqrt{92}}=9.250\]
Thus, the #91\%# confidence interval for the population mean #\mu# is:
\[CI_{\mu,\,91\%}=(8.750,\,\,\, 9.250)\]
For a given confidence level #C# (in #\%#), the critical value #t^*# of the #t_{n-1}# distribution is the value such that #\mathbb{P}(-t^* \leq t \leq t^*)=\cfrac{C}{100}#.
To calculate this critical value #t^*# in R, make use of the following function:
qt(p, df, lower.tail)
- p: A probability corresponding to the #t#-distribution.
- df: An integer indicating the number of degrees of freedom.
- lower.tail: If TRUE (default), probabilities are #\mathbb{P}(X \leq x)#, otherwise, #\mathbb{P}(X \gt x)#.
Here, we have #C=91#. Thus, to calculate #t^*# such that #\mathbb{P}(-t^* \leq t \leq t^*)=0.91#, run the following command:
\[\begin{array}{c}
\text{qt}(p = (100+C)/200, df = n \text{ - } 1, lower.tail = \text{TRUE})\\
\downarrow\\
\text{qt}(p =191/200, df = 92 \text { - } 1, lower.tail = \text{TRUE})
\end{array}\]
This gives:
\[t^* = 1.71364\]
Calculate the lower bound #L# of the confidence interval:
\[L = \bar{X} - t^* \cdot \cfrac{s}{\sqrt{n}} = 9.0 - 1.71364 \cdot \cfrac{1.4}{\sqrt{92}}=8.750\]
Calculate the upper bound #U# of the confidence interval:
\[U = \bar{X} + t^* \cdot \cfrac{s}{\sqrt{n}} = 9.0 + 1.71364 \cdot \cfrac{1.4}{\sqrt{92}}=9.250\]
Thus, the #91\%# confidence interval for the population mean #\mu# is:
\[CI_{\mu,\,91\%}=(8.750,\,\,\, 9.250)\]
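The arithmetic in the worked solution can be checked with a few lines of code. The following is a minimal Python sketch, assuming the critical value #t^* = 1.71364# has already been obtained from Excel or R; the function name `t_confidence_interval` is hypothetical and introduced only for this illustration.

```python
import math

def t_confidence_interval(xbar, s, n, t_star):
    """Return (lower, upper) = xbar -/+ t_star * s / sqrt(n)."""
    margin = t_star * s / math.sqrt(n)  # margin of error
    return xbar - margin, xbar + margin

# Values from the worked example: sample mean 9.0, s = 1.4, n = 92.
lower, upper = t_confidence_interval(xbar=9.0, s=1.4, n=92, t_star=1.71364)
print(f"({lower:.3f}, {upper:.3f})")  # prints (8.750, 9.250)
```

Keeping #t^*# as an input makes the sketch independent of how the critical value was computed.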
Connection to Hypothesis Testing
There exists a direct connection between a two-sided one-sample #t#-test for #\mu# and a #(1-\alpha)\cdot 100\%# confidence interval for #\mu# based on the #t#-distribution:
- If #\mu_0# falls inside the #(1 - \alpha)\cdot 100\%\,CI#, then #H_0: \mu=\mu_0# should not be rejected at the #\alpha# level of significance.
- If #\mu_0# falls outside of the #(1 - \alpha)\cdot 100\%\,CI#, then #H_0: \mu=\mu_0# should be rejected at the #\alpha# level of significance.
Suppose you use the same sample to test #H_0: \mu = 39# against #H_a: \mu \neq 39# at the #\alpha = 0.02# level of significance.
What would be the conclusion?
Since the #98\%# confidence interval #(41.782,\,\,46.488)# does not contain the value #\mu_0 = 39#, we would reject #H_0: \mu = 39# at the #\alpha = 0.02# level of significance.
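The decision rule that links the confidence interval to the two-sided test can be written as a small check. This is a Python sketch under stated assumptions: the function name `two_sided_decision` is hypothetical, the interval is taken as given, and a value of #\mu_0# lying exactly on an endpoint is treated here as inside the interval, which is a convention chosen for the sketch.

```python
def two_sided_decision(ci, mu_0):
    """Decide on H0: mu = mu_0 from a (1 - alpha) * 100% confidence interval."""
    lower, upper = ci
    if lower <= mu_0 <= upper:
        return "do not reject H0"
    return "reject H0"

# The 98% interval and mu_0 from the example above (alpha = 0.02).
print(two_sided_decision((41.782, 46.488), 39))  # prints reject H0
```

The rule only applies when the confidence level of the interval matches the significance level of the test, that is, a #(1-\alpha)\cdot 100\%# interval paired with a two-sided test at level #\alpha#.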