8. Testing for Differences in Means and Proportions: Independent Samples t-test
Confidence Interval for the Difference Between Two Independent Means
Confidence Interval for the Difference Between Two Population Means
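Assuming the sampling distribution of the difference between two sample means is (approximately) normal, the general formula for computing a $C\%$ confidence interval for the difference between the two population means is:
$$(\bar{X}_1 - \bar{X}_2) \pm t^* \cdot SE_{\bar{X}_1 - \bar{X}_2}$$
Here $SE_{\bar{X}_1 - \bar{X}_2}$ is the estimated standard error of the difference between the two sample means, and $t^*$ is the critical value of the $t$-distribution such that $P(-t^* \leq t \leq t^*) = \frac{C}{100}$.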
Calculating t* with Statistical Software
Let $C$ be the confidence level in $\%$.
To calculate the critical value in Excel, make use of the function T.INV():
To calculate the critical value in R, make use of the function qt():
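As a minimal sketch of this calculation, assume a hypothetical confidence level of 95% and 10 degrees of freedom (both values are illustrative and are not taken from the example that follows). Because the interval is two-sided, the critical value is the quantile with cumulative probability $(1 + C/100)/2$, which leaves $(1 - C/100)/2$ in each tail:

```r
# Hypothetical inputs, chosen only for illustration
C  <- 95   # confidence level in %
df <- 10   # degrees of freedom

# A two-sided interval leaves (1 - C/100) / 2 in each tail,
# so the cumulative probability to look up is (1 + C/100) / 2.
p      <- (1 + C / 100) / 2
t_star <- qt(p, df)
print(t_star)
```

The same cumulative probability `p` can be passed to Excel's left-tailed `T.INV(p, df)`.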
Do boys and girls perform differently on driving tests? To investigate this matter, a researcher selects a simple random sample of boys and girls and gives each of them a driving test.
Each student gets a score from to . These are their test results:
Boys | Girls |
|
|
You may assume that the test scores are approximately normally distributed.
Construct a confidence interval for the difference between the two population means . Round your answers to decimal places.
There are a number of different ways we can compute the confidence interval. The two worked solutions that follow use Excel and R, respectively.
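Assuming the test scores are approximately normally distributed, we know that the sampling distribution of the difference between two sample means is (approximately) normal as well.
If the sampling distribution of the difference between two sample means is (approximately) normal, the general formula for computing a $C\%$ confidence interval for the difference between the two population means is:
$$(\bar{X}_1 - \bar{X}_2) \pm t^* \cdot SE_{\bar{X}_1 - \bar{X}_2}$$
where $SE_{\bar{X}_1 - \bar{X}_2}$ is the estimated standard error of the difference between the two sample means.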
Determine the degrees of freedom:
For a given confidence level $C$ (in $\%$), the critical value $t^*$ of the $t$-distribution is the value such that $P(-t^* \leq t \leq t^*) = \frac{C}{100}$.
To calculate this critical value in Excel, make use of the following function:
T.INV(probability, deg_freedom)
- probability: The cumulative (left-tail) probability associated with the Student's t-distribution.
- deg_freedom: The number of degrees of freedom of the distribution.
Here, we have . Thus, to calculate such that , run the following command:
This gives:
Calculate the lower bound of the confidence interval:
Calculate the upper bound of the confidence interval:
Thus, the confidence interval for the difference between the two population means is:
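Assuming the test scores are approximately normally distributed, we know that the sampling distribution of the difference between two sample means is (approximately) normal as well.
If the sampling distribution of the difference between two sample means is (approximately) normal, the general formula for computing a $C\%$ confidence interval for the difference between the two population means is:
$$(\bar{X}_1 - \bar{X}_2) \pm t^* \cdot SE_{\bar{X}_1 - \bar{X}_2}$$
where $SE_{\bar{X}_1 - \bar{X}_2}$ is the estimated standard error of the difference between the two sample means.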
Determine the degrees of freedom:
For a given confidence level $C$ (in $\%$), the critical value $t^*$ of the $t$-distribution is the value such that $P(-t^* \leq t \leq t^*) = \frac{C}{100}$.
To calculate this critical value in R, make use of the following function:
qt(p, df, lower.tail)
- p: A probability corresponding to the t-distribution.
- df: The number of degrees of freedom (greater than 0; it does not have to be an integer).
- lower.tail: If TRUE (default), probabilities are $P(X \leq x)$; otherwise, $P(X > x)$.
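To illustrate how `lower.tail` changes the meaning of `p`, the sketch below assumes a hypothetical 10 degrees of freedom and a tail probability of 0.025 (illustrative values only). It shows that the upper critical value can be obtained from either tail, and that the lower critical value is its negative by symmetry:

```r
df <- 10  # hypothetical degrees of freedom, for illustration only

# Same upper critical value, specified from either tail
upper_from_lower_tail <- qt(0.975, df)                      # P(T <= t) = 0.975
upper_from_upper_tail <- qt(0.025, df, lower.tail = FALSE)  # P(T >  t) = 0.025

# The t-distribution is symmetric around 0, so this is -t*
lower_critical <- qt(0.025, df)

print(c(upper_from_lower_tail, upper_from_upper_tail, lower_critical))
```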
Here, we have . Thus, to calculate such that , run the following command:
This gives:
Calculate the lower bound of the confidence interval:
Calculate the upper bound of the confidence interval:
Thus, the confidence interval for the difference between the two population means is:
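The steps of this solution can be sketched end to end in R. The scores below are hypothetical (they are not the data from the exercise), the 95% confidence level is an assumption, and the sketch uses the unpooled standard error with Welch-Satterthwaite degrees of freedom, which is one common choice and may differ from the degrees-of-freedom rule intended here:

```r
# Hypothetical test scores (NOT the data from the exercise)
boys  <- c(7.1, 6.4, 8.0, 5.9, 7.5, 6.8)
girls <- c(7.9, 8.3, 6.9, 8.8, 7.4, 8.1, 7.7)
C <- 95  # assumed confidence level in %

n1 <- length(boys)
n2 <- length(girls)
v1 <- var(boys) / n1    # s1^2 / n1
v2 <- var(girls) / n2   # s2^2 / n2

diff_means <- mean(boys) - mean(girls)
se <- sqrt(v1 + v2)     # unpooled standard error (assumption)

# Welch-Satterthwaite degrees of freedom (assumption)
df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# Critical value, then lower and upper bounds
t_star <- qt((1 + C / 100) / 2, df)
ci <- c(lower = diff_means - t_star * se,
        upper = diff_means + t_star * se)
print(round(ci, 3))

# Cross-check against R's built-in Welch interval
builtin <- t.test(boys, girls, conf.level = C / 100)$conf.int
print(builtin)
```

With these assumptions the manual bounds agree with the interval reported by `t.test()`, whose default (`var.equal = FALSE`) is the Welch procedure.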
Connection to Hypothesis Testing
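There exists a direct connection between a two-sided independent samples $t$-test for the difference $\mu_1 - \mu_2$ and a $C\%$ confidence interval for $\mu_1 - \mu_2$:
- If $0$ falls inside the $C\%$ confidence interval, then $H_0: \mu_1 - \mu_2 = 0$ should not be rejected at the $\alpha = 1 - \frac{C}{100}$ level of significance.
- If $0$ falls outside of the $C\%$ confidence interval, then $H_0: \mu_1 - \mu_2 = 0$ should be rejected at the $\alpha = 1 - \frac{C}{100}$ level of significance.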
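This correspondence can be checked numerically. The sketch below uses hypothetical samples and an assumed 95% confidence level (so a 0.05 significance level), and relies on R's `t.test()` with its default Welch procedure. It compares the decision based on the confidence interval with the decision based on the two-sided p-value:

```r
# Hypothetical samples, for illustration only
x <- c(7.1, 6.4, 8.0, 5.9, 7.5, 6.8)
y <- c(7.9, 8.3, 6.9, 8.8, 7.4, 8.1, 7.7)

C     <- 95            # assumed confidence level in %
alpha <- 1 - C / 100   # matching significance level

# Two-sided test of H0: mu1 - mu2 = 0, with the matching confidence interval
res <- t.test(x, y, mu = 0, alternative = "two.sided", conf.level = C / 100)

zero_in_ci <- res$conf.int[1] <= 0 && 0 <= res$conf.int[2]
reject_h0  <- res$p.value < alpha

# H0 is rejected exactly when 0 lies outside the interval
print(c(zero_in_ci = zero_in_ci, reject_h0 = reject_h0))
```

Whatever data are supplied, the two flags should be logical opposites, provided the confidence level and the significance level sum to 100%.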
Suppose you use the same samples to test against at the level of significance.
What would be the conclusion?
Since the confidence interval does not contain the value , we would reject at the level of significance.