Confidence Interval for μ when σ is Unknown

7. Hypothesis Testing: One-sample t-test

Thus far, we have used the following formula to compute a $C\%\,CI$ for a population mean $\mu$ :

$CI_{\mu}=\bigg(\bar{X} - z^*\cdot \cfrac{\sigma}{\sqrt{n}},\,\,\,\, \bar{X} + z^*\cdot \cfrac{\sigma}{\sqrt{n}} \bigg)$

Just like the calculation of the $Z$ -statistic, this calculation requires the population standard deviation $\sigma$ to be known.

When $\sigma$ is unknown, we will have to rely on the $t$ -distribution and the sample standard deviation $s$ to compute the confidence interval.

Confidence Interval for a Population Mean when σ is Unknown

Assuming the sampling distribution of the sample mean is (approximately) normal, the general formula for computing a $C\%$ CI for a population mean $\mu$ , based on a random sample of size $n$ , when σ is unknown, is:

$CI_{\mu}=\bigg(\bar{X} - t^*\cdot \cfrac{s}{\sqrt{n}},\,\,\,\, \bar{X} + t^*\cdot \cfrac{s}{\sqrt{n}} \bigg)$

Where $t^*$ is the critical value of the $t_{n-1}$ distribution such that $\mathbb{P}(-t^* \leq t \leq t^*)=\frac{C}{100}$ .

Calculating t* with Statistical Software

Let $C$ be the confidence level in $\%$ .

To calculate the critical value $t^*$ in Excel, make use of the function T.INV():

$=\text{T.INV}((100+C)/200, n \text{ - } 1)$

To calculate the critical value $t^*$ in R, make use of the function qt():

$\text{qt}(p=(100+C)/200, df=n \text{ - } 1,lower.tail = \text{TRUE})$
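The argument $(100+C)/200$ follows from the two-sided definition of $t^*$: the central area is $C/100$, so each tail holds $(1 - C/100)/2$, and the lower-tail probability at $t^*$ is $1 - (1 - C/100)/2 = (100+C)/200$. As a rough cross-check of the Excel and R calls, the same quantile can be approximated with only the Python standard library. The sketch below is an illustration and not how T.INV or qt are implemented; the names t_pdf, t_cdf and t_critical are hypothetical helpers, and the approach (Simpson's rule plus bisection) assumes $C \geq 0$ and a critical value below 1000.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x] (steps must be even)."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(conf_level_pct, n):
    """Approximate t* for a C% two-sided interval from a sample of size n."""
    p = (100 + conf_level_pct) / 200   # lower-tail probability at t*
    df = n - 1
    lo, hi = 0.0, 1000.0               # assumed bracket for t*
    for _ in range(60):                # bisection on the CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t_star = t_critical(95, 37)
print(round(t_star, 5))
```

For $C=95$ and $n=37$ the printed value should agree with what T.INV and qt return for the same inputs, up to rounding.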

A professor at a university wants to estimate the time his students spend on their homework.

He collects a random sample of $37$ students. On average, these students spent $6.0$ hours a week on their homework with a standard deviation of $s=1.5$ hours.

Construct a $95\%$ confidence interval for the population mean $\mu$ . Round your answers to $3$ decimal places.

$CI_{\mu,\,95\%}=(5.500,\,\,\, 6.500)$

There are a number of different ways we can compute the confidence interval.

Excel Calculation

Since the population standard deviation $\sigma$ is unknown, we will have to use the $t$ -distribution and the sample standard deviation $s$ to construct the confidence interval.

A sample size of $n=37$ is generally considered large enough (a common rule of thumb is $n \geq 30$) for the Central Limit Theorem to apply.

This means that, although the sample in question comes from a population having an unknown distribution, the sampling distribution of the sample mean is approximately normal.

Assuming the sampling distribution of the sample mean is (approximately) normal, the general formula for computing a $C\%\, CI$ for the population mean $\mu$ , based on a random sample of size $n$ , is:

$CI_{\mu}=\bigg(\bar{X} - t^*\cdot \cfrac{s}{\sqrt{n}},\,\,\,\, \bar{X} + t^*\cdot \cfrac{s}{\sqrt{n}} \bigg)$
For a given confidence level $C$ (in $\%$ ), the critical value $t^*$ of the $t_{n-1}$ distribution is the value such that $\mathbb{P}(-t^* \leq t \leq t^*)=\cfrac{C}{100}$ .

To calculate this critical value $t^*$ in Excel, make use of the following function:

T.INV(probability, deg_freedom)

probability: The lower-tail probability associated with the Student's $t$ -distribution.

deg_freedom: The number of degrees of freedom of the distribution.

Here, we have $C=95$ . Thus, to calculate $t^*$ such that $\mathbb{P}(-t^* \leq t \leq t^*)=0.95$ , run the following command:

$\begin{array}{c} =\text{T.INV}((100+C)/200, n - 1)\\ \downarrow\\ =\text{T.INV}(195/200, 37 \text{ - } 1) \end{array}$
This gives:

$t^* = 2.02809$
Calculate the lower bound $L$ of the confidence interval:

$L = \bar{X} - t^* \cdot \cfrac{s}{\sqrt{n}} = 6.0 - 2.02809 \cdot \cfrac{1.5}{\sqrt{37}}=5.500$
Calculate the upper bound $U$ of the confidence interval:

$U = \bar{X} + t^* \cdot \cfrac{s}{\sqrt{n}} = 6.0 + 2.02809 \cdot \cfrac{1.5}{\sqrt{37}}=6.500$
Thus, the $95\%$ confidence interval for the population mean $\mu$ is:

$CI_{\mu,\,95\%}=(5.500,\,\,\, 6.500)$

R Calculation

To calculate this critical value $t^*$ in R, make use of the following function:

qt(p, df, lower.tail)

p: A probability corresponding to the $t$ -distribution.

df: An integer indicating the number of degrees of freedom.

lower.tail: If TRUE (default), probabilities are $\mathbb{P}(X \leq x)$ , otherwise, $\mathbb{P}(X \gt x)$ .

Here, we have $C=95$ . Thus, to calculate $t^*$ such that $\mathbb{P}(-t^* \leq t \leq t^*)=0.95$ , run the following command:

$\begin{array}{c} \text{qt}(p = (100+C)/200, df = n \text{ - } 1, lower.tail = \text{TRUE})\\ \downarrow\\ \text{qt}(p =195/200, df = 37 \text { - } 1, lower.tail = \text{TRUE}) \end{array}$
This gives:

$t^* = 2.02809$
Calculate the lower bound $L$ of the confidence interval:

$L = \bar{X} - t^* \cdot \cfrac{s}{\sqrt{n}} = 6.0 - 2.02809 \cdot \cfrac{1.5}{\sqrt{37}}=5.500$
Calculate the upper bound $U$ of the confidence interval:

$U = \bar{X} + t^* \cdot \cfrac{s}{\sqrt{n}} = 6.0 + 2.02809 \cdot \cfrac{1.5}{\sqrt{37}}=6.500$
Thus, the $95\%$ confidence interval for the population mean $\mu$ is:

$CI_{\mu,\,95\%}=(5.500,\,\,\, 6.500)$
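The arithmetic in this example can also be checked with a few lines of Python. This is a minimal sketch using the values from the exercise and the critical value $t^* = 2.02809$ reported above; the function name t_interval is a hypothetical helper introduced only for illustration.

```python
import math

def t_interval(x_bar, s, n, t_star):
    """Return (lower, upper) bounds of the t-based confidence interval."""
    margin = t_star * s / math.sqrt(n)  # margin of error: t* times the standard error
    return x_bar - margin, x_bar + margin

# Values from the homework example; t_star is taken from the text.
lower, upper = t_interval(x_bar=6.0, s=1.5, n=37, t_star=2.02809)
print(f"({lower:.3f}, {upper:.3f})")  # prints (5.500, 6.500)
```

Passing $t^*$ in as an argument keeps the sketch independent of any statistics library, so the critical value can come from Excel, R, or a table.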

Connection to Hypothesis Testing

There exists a direct connection between a two-sided one-sample $t$ -test for $\mu$ and a $(1-\alpha)\cdot 100\%$ confidence interval for $\mu$ based on the $t$ -distribution:

If $\mu_0$ falls inside the $(1 - \alpha)\cdot 100\%\,CI$ , then $H_0: \mu=\mu_0$ should not be rejected at the $\alpha$ level of significance.
If $\mu_0$ falls outside of the $(1 - \alpha)\cdot 100\%\,CI$ , then $H_0: \mu=\mu_0$ should be rejected at the $\alpha$ level of significance.

A $91\%$ confidence interval for a population mean $\mu$ , computed based on a simple random sample from the population, is $(33.066,\,\, 38.042)$ .

Suppose you use the same sample to test $H_0: \mu = 30$ against $H_a: \mu \neq 30$ at the $\alpha = 0.09$ level of significance.

What would be the conclusion?

Reject $H_0$ .

Since the $91\%$ confidence interval $(33.066,\,\,38.042)$ does not contain the value $\mu_0 = 30$ , we would reject $H_0: \mu = 30$ at the $\alpha = 0.09$ level of significance.
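The decision rule can be sketched in Python as follows. The interval is symmetric around $\bar{X}$, so its midpoint recovers $\bar{X}$ and its half-width recovers the margin of error $t^*\cdot s/\sqrt{n}$; the test statistic satisfies $|t| > t^*$ exactly when $|\bar{X} - \mu_0|$ exceeds that margin, which is the same as $\mu_0$ lying outside the interval. The function name two_sided_decision is a hypothetical helper, and the second call uses an illustrative value $\mu_0 = 35$ that does not appear in the exercise.

```python
def two_sided_decision(ci_lower, ci_upper, mu_0):
    """Decide a two-sided test of H0: mu = mu_0 from a (1 - alpha) * 100% CI."""
    x_bar = (ci_lower + ci_upper) / 2    # midpoint of the interval is the sample mean
    margin = (ci_upper - ci_lower) / 2   # half-width is the margin of error
    # |x_bar - mu_0| > margin is equivalent to mu_0 falling outside the interval
    if abs(x_bar - mu_0) > margin:
        return "reject H0"
    return "do not reject H0"

# Interval and mu_0 from the example above (alpha = 0.09)
print(two_sided_decision(33.066, 38.042, 30))  # prints reject H0

# Illustrative value inside the interval (not from the exercise)
print(two_sided_decision(33.066, 38.042, 35))  # prints do not reject H0
```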
