Testing for Differences Between Means

8. Testing for Differences in Means and Proportions: Practical 8

Unpaired two-sample t-test

In the previous practical we used the t.test() function to test whether the mean of a population differed significantly from the value expected under the null hypothesis. Now we will test whether the difference in means between two populations is significant. In R you can use the same t.test() function for this purpose. Let's illustrate this with an example.

The introduction to the air quality dataset stated that NO2 concentrations are higher on weekdays than at the weekend. Check this statement with a t-test on the means.

We will go through the process step-by-step.

1) Formalise the hypothesis that you want to test. For example:

$H_0$ : $\mu_{weekday} \leq \mu_{weekend}$
$H_a$ : $\mu_{weekday} > \mu_{weekend}$

2) Select a location and sample the data.

If you take Amsterdam-Vondelpark as the location, you can select all rows that contain weekday measurements into one dataframe and all rows that contain weekend measurements into a second dataframe. An easy way to achieve this is to first create two vectors: one with the weekdays and one with the weekend days. The %in% operator can then be used to check whether the day in the dataframe is in the weekday vector or in the weekend vector. This might sound complicated, but the code below is quite intuitive.

weekdays <- c('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday')
weekend <- c('Saturday', 'Sunday')
NO2_vp_week <- NO2_vp[NO2_vp$weekday %in% weekdays, ]
NO2_vp_weekend <- NO2_vp[NO2_vp$weekday %in% weekend, ]
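To try the selection without the air quality data loaded, the same %in% logic can be sketched on a small made-up dataframe. The dataframe `toy`, its values and the vector names `week_days` and `weekend_days` are hypothetical; only the column names `weekday` and `value` follow the code above.

```r
# Hypothetical toy data: two weeks of daily measurements (values are made up)
day_names <- c('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday',
               'Saturday', 'Sunday')
toy <- data.frame(weekday = rep(day_names, times = 2),
                  value = c(26, 25, 27, 24, 26, 20, 19,
                            25, 27, 26, 25, 24, 21, 20))

week_days <- c('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday')
weekend_days <- c('Saturday', 'Sunday')

# %in% returns TRUE or FALSE for every row; the logical vector selects the rows
toy_week <- toy[toy$weekday %in% week_days, ]
toy_weekend <- toy[toy$weekday %in% weekend_days, ]

# Every row should end up in exactly one of the two subsets
nrow(toy_week)
nrow(toy_weekend)
nrow(toy_week) + nrow(toy_weekend) == nrow(toy)
```

Checking that the two subsets together contain all rows is a quick way to spot misspelled day names, which would silently drop observations.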

3) Check the summaries of the new dataframes to be sure that the sampling is correct.

The column 'weekday' shows the number of observations for each day of the week.

summary(NO2_vp_week)
summary(NO2_vp_weekend)

4) Perform the t-test using the t.test() function.

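This time you will need to fill in both the arguments x and y, because this is a two-sample t-test. Because x holds the weekday values and y the weekend values, alternative = 'greater' corresponds to $H_a$ : $\mu_{weekday} > \mu_{weekend}$ . Here we use a confidence level of $99\%$ .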

t.test(x=NO2_vp_week$value, y=NO2_vp_weekend$value, alternative = 'greater', conf.level = 0.99)

      Welch Two Sample t-test

data:  NO2_vp_week$value and NO2_vp_weekend$value
t = 9.6897, df = 1062.9, p-value < 2.2e-16
alternative hypothesis: true difference in means is greater than 0
99 percent confidence interval:
 3.810699      Inf
sample estimates:
mean of x mean of y 
 25.68235  20.66532

5) Interpret the results.

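The $p$ -value (< 2.2e-16) is far below the significance level of $\alpha = 0.01$ , so we can reject the null-hypothesis: the mean NO2 concentration on weekdays is significantly higher than at the weekend. The one-sided $99\%$ confidence interval indicates that the true difference in means is at least about $3.81$ μg/m3.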
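To see where the numbers in such an output come from, the Welch statistic can be recomputed by hand. The sketch below uses simulated samples (made-up numbers, not the air quality data); the names `x`, `y`, `se`, `t_stat`, `df_welch`, `p_val` and `lower` are chosen here for illustration only.

```r
# Simulated samples (made-up numbers, not the air quality data)
set.seed(1)
x <- rnorm(50, mean = 26, sd = 10)   # plays the role of the weekday values
y <- rnorm(20, mean = 21, sd = 8)    # plays the role of the weekend values

# Standard error of the difference in means (variances are not pooled)
vx <- var(x) / length(x)
vy <- var(y) / length(y)
se <- sqrt(vx + vy)

# Welch t-statistic and Welch-Satterthwaite degrees of freedom
t_stat <- (mean(x) - mean(y)) / se
df_welch <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# One-sided p-value for H_a: mu_x > mu_y
p_val <- pt(t_stat, df_welch, lower.tail = FALSE)

# Lower bound of the one-sided 99% confidence interval
lower <- (mean(x) - mean(y)) - qt(0.99, df_welch) * se

# Compare with the built-in function
res <- t.test(x, y, alternative = 'greater', conf.level = 0.99)
all.equal(unname(res$statistic), t_stat)
all.equal(unname(res$parameter), df_welch)
all.equal(res$p.value, p_val)
all.equal(res$conf.int[1], lower)
```

The non-integer degrees of freedom in the t.test() output (for example df = 1062.9) are a consequence of this Welch correction, which does not assume equal variances in the two groups.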

Paired two-sample t-test

In the previous section, we compared averages between two independent samples (weekdays and weekend days). However, sometimes such a comparison is not the most informative one. A good example is comparing air quality at different stations. Air quality differs considerably between weekdays and weekend days, and it also varies with the weather. These effects are not location-specific and probably affect every station in Amsterdam in a similar way. So by taking the difference between the station values on the same day, you get a much more precise measure of the difference between stations than by comparing station averages.

This way of relating the two groups in a hypothesis test is called 'pairing': the observations in the two samples are related (also called 'dependent') and can therefore be treated as pairs.

If the data allow pairing, it is generally a good idea to use it, because pairing removes the variance caused by external factors from the comparison and therefore makes the test more powerful.
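The gain in power can be illustrated with a small simulation. This is only a sketch under stated assumptions: the two "stations" are simulated, they share a common day-to-day effect (standing in for weather and weekday effects), and all names and numbers are made up.

```r
# Simulated data: both stations share the same day-to-day variation
set.seed(42)
n <- 200
day_effect <- rnorm(n, mean = 0, sd = 10)          # shared by both stations
station_a <- 20 + day_effect + rnorm(n, sd = 2)    # true mean 20
station_b <- 22 + day_effect + rnorm(n, sd = 2)    # true mean 22

# The shared day effect cancels out in the differences
sd(station_a)               # large: includes the day-to-day variation
sd(station_a - station_b)   # small: only station-specific noise remains

# Same data, analysed without and with pairing
unpaired <- t.test(station_a, station_b)
paired <- t.test(station_a, station_b, paired = TRUE)

abs(unname(unpaired$statistic))
abs(unname(paired$statistic))
```

Because the standard deviation of the differences is much smaller than that of the individual series, the paired test gives a t-value that is much further from zero for the same underlying difference.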

In the t.test() function there is an option to specify whether two samples are paired. Before we check how that works, we will first step back and see how the hypotheses for a paired test are formulated.

If we want to test whether the (unpaired) average concentrations are different at the Amsterdam-Vondelpark and the Amsterdam-Stadhouderskade stations, the hypotheses are

$H_0$ : $\mu_{vp} = \mu_{shk}$
$H_a$ : $\mu_{vp} \neq \mu_{shk}$

in which vp stands for Amsterdam-Vondelpark and shk for Amsterdam-Stadhouderskade.

And if we want to test whether the paired concentrations are different, the hypotheses are

$H_0$ : $\mu_{vp-shk} = 0$
$H_a$ : $\mu_{vp-shk} \neq 0$

Here $\mu_{vp-shk}$ is the mean of the day-by-day differences between the concentrations at Vondelpark and Stadhouderskade.

So essentially, pairing turns the test into a one-sample test on the differences. Let's now see how a paired two-sample t-test works by comparing the NO2 concentrations at the Amsterdam-Vondelpark and the Amsterdam-Stadhouderskade stations on the same days.
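That a paired test is a one-sample test on the differences can be checked directly. The sketch below uses simulated vectors `a` and `b` (hypothetical, not the station data) and compares the two ways of calling t.test().

```r
# Simulated paired measurements (made-up numbers)
set.seed(7)
a <- rnorm(30, mean = 20, sd = 5)
b <- a + rnorm(30, mean = 3, sd = 2)

# Paired two-sample test ...
paired_res <- t.test(a, b, paired = TRUE)

# ... and a one-sample test on the differences with mu = 0
one_sample_res <- t.test(a - b, mu = 0)

# Both give the same t-value, degrees of freedom and p-value
all.equal(unname(paired_res$statistic), unname(one_sample_res$statistic))
all.equal(unname(paired_res$parameter), unname(one_sample_res$parameter))
all.equal(paired_res$p.value, one_sample_res$p.value)
```

The degrees of freedom equal the number of pairs minus one, which is why the number of matched observations matters in the data preparation step.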

Test whether the mean NO2 concentration at the Amsterdam-Vondelpark station is statistically different from the Amsterdam-Stadhouderskade station on the same days.
Decide first whether you need a paired or an unpaired t-test for this.

Give the value of the test-statistic as your answer to this question. Use 1 decimal in your answer.

In this question you need a paired t-test. The test-statistic for a t-test is the t-value. And the t-value resulting from this question is: $-78.3$ .

The key data preparation step for our analysis is to match the values from the two stations that are going to be compared. We already have separate dataframes with the NO2 concentrations for both stations. So now we will check whether each date is present in both dataframes. The following code selects, for each location, the rows with dates that are also present in the other dataframe.

NO2_vp_paired <- NO2_vp[NO2_vp$date %in% NO2_shk$date,]
NO2_shk_paired <- NO2_shk[NO2_shk$date %in% NO2_vp$date,]

With the summary() command you can check if both dataframes contain the same number of observations.

summary(NO2_vp_paired)
summary(NO2_shk_paired)

Good, both dataframes contain $1779$ observations. The dates should already be ordered, which ensures that the pairs are lined up in the dataframes and that you can compare them. However, it is always better to check this. The easiest way to do so is to look for dates that are not lined up.

which(NO2_vp_paired$date != NO2_shk_paired$date)

No dates differ, so you can perform the paired t-test. Let's use the default significance level of 0.05 for this test.
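An alternative way to line up the pairs is to join the two dataframes on the date, so that matching values end up on the same row regardless of the original ordering. The sketch below uses base R's merge() on two small made-up dataframes; the names `vp`, `shk` and `both` and all values are hypothetical, and only the column names `date` and `value` follow the code above.

```r
# Hypothetical toy dataframes: the second one misses a day and is stored in a different order
vp <- data.frame(date = as.Date('2020-01-01') + 0:4,
                 value = c(20, 22, 19, 25, 21))
shk <- data.frame(date = as.Date('2020-01-01') + c(4, 0, 2, 1),
                  value = c(33, 31, 30, 35))

# merge() keeps only the dates present in both dataframes
# and puts the two matching values on one row
both <- merge(vp, shk, by = 'date', suffixes = c('_vp', '_shk'))

nrow(both)                        # number of shared dates
both$value_vp - both$value_shk    # day-by-day differences

# The paired test can then be run on the two columns
t.test(x = both$value_vp, y = both$value_shk, paired = TRUE)
```

With this approach the pairs cannot get out of step, because each row of the merged dataframe holds exactly one date.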

t.test(x = NO2_vp_paired$value, y = NO2_shk_paired$value, alternative = "two.sided", paired = TRUE)

	Paired t-test

data:  NO2_vp_paired$value and NO2_shk_paired$value
t = -78.274, df = 1778, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -12.07983 -11.48927
sample estimates:
mean of the differences 
              -11.78455

The $p$ -value is far below the significance level of $0.05$ . We can therefore reject the null-hypothesis; the population means of the NO2 concentrations at these two locations are significantly different. We can state with $95\%$ confidence that the true mean difference between the NO2 concentrations at the Vondelpark and the Stadhouderskade lies between approximately $-12.08$ and $-11.49$ μg/m3.

Note:

The output does not state explicitly which sample is subtracted from which. The t.test() function subtracts y from x, so a negative mean difference means that the Vondelpark values are lower than those at Stadhouderskade. You can quickly confirm this by calculating the difference between the means:

mean(NO2_vp_paired$value) - mean(NO2_shk_paired$value)

This results in a value of $-11.78$ , so Vondelpark has the lower mean NO2 concentration.

Here we have conducted a two-sided test. This is the best choice when you have no strong idea about the direction of the effect. However, if you expect a difference in a particular direction prior to the test (either based on theory or prior observations), you might specify a one-sided hypothesis test. For example, because there is much more traffic at the Stadhouderskade than along the Vondelpark, we would expect the concentration of NO2 as well as PM10 to be higher at the former. So when comparing these two locations you might want to specify a one-sided hypothesis test.
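A one-sided paired test for this expectation would use $H_0$ : $\mu_{vp-shk} \geq 0$ and $H_a$ : $\mu_{vp-shk} < 0$ . Because t.test() subtracts y from x, this corresponds to alternative = 'less' when the Vondelpark values are passed as x. The sketch below illustrates this on simulated data; the vectors `quiet` and `busy` are hypothetical stand-ins for the two stations, not the real measurements.

```r
# Simulated paired data (made-up numbers)
set.seed(3)
quiet <- rnorm(100, mean = 25, sd = 8)           # stands in for Vondelpark
busy <- quiet + rnorm(100, mean = 12, sd = 5)    # stands in for Stadhouderskade

# H_a: mean of (quiet - busy) < 0, i.e. the busy location is higher
one_sided <- t.test(x = quiet, y = busy, alternative = 'less', paired = TRUE)
two_sided <- t.test(x = quiet, y = busy, alternative = 'two.sided', paired = TRUE)

# The t-value is identical; only the p-value and confidence interval change
unname(one_sided$statistic)
one_sided$p.value
two_sided$p.value
one_sided$conf.int    # one-sided interval: lower bound is -Inf
```

When the observed difference lies in the expected direction, the one-sided p-value is half the two-sided one. The direction has to be chosen before looking at the data, otherwise the stated significance level no longer holds.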