8. Testing for Differences in Means and Proportions: Paired Samples t-test
Paired Samples t-test: Purpose, Hypotheses, and Assumptions
In this chapter, we will consider research designs in which a continuous variable is measured twice on a simple random sample of #n# subjects. These two measurements of the same variable will be denoted by #X# and #Y#. Together, these two scores form a matched pair for each subject in the sample.
#\phantom{0}#
Paired Data Research Designs
Typical research designs that produce paired data are:
- Subjects are given a pre-test whose score is #X#, then some treatment is given, then they are given a post-test whose score is #Y#.
- Subjects are measured under two different conditions at two different times. Under such circumstances, it is important that the order of the conditions is randomized to prevent an order effect from occurring. Then #X# is the measurement under one condition and #Y# is the measurement under the other.
- Subjects are not individual people but dyads (pairs of people), such as twins or couples in a relationship. Then #X# is measured on one member of the pair and #Y# is measured on the other.
- Measurements are taken on two different parts of the body, such as the left and right arm, or the left and right eye. Then #X# is measured on one part and #Y# on the other.
#\phantom{0}#
In such cases, we are not necessarily interested in making inferences about either #X# or #Y# separately. Rather, we want to draw conclusions about the difference #D# between them.
Whether you define the difference as #D=X-Y# or #D=Y-X# does not matter for the outcome of the statistical test, as long as you remain consistent in your choice of how the difference is computed. Switching the definition only reverses the sign of every difference score, and with it the direction of a one-tailed test.
Let #\mu_D# denote the unknown mean difference for the matched pairs if #D# were measured on the entire population, and #\sigma_D# the unknown population standard deviation of the differences. To conduct inferences about #\mu_D#, a paired samples #t#-test should be used.
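As an illustration of how difference scores are formed, the following minimal sketch computes #D# under both conventions and the corresponding sample mean and sample standard deviation. The scores are made up for this example, and the variable names (`x`, `y`, `d_xy`, `d_yx`) are our own choices rather than anything prescribed by the test.

```python
from statistics import mean, stdev

# Hypothetical pre-test (X) and post-test (Y) scores for n = 6 subjects.
x = [12, 15, 11, 14, 13, 16]
y = [14, 18, 11, 17, 15, 18]

# Difference scores under the two possible conventions.
d_xy = [xi - yi for xi, yi in zip(x, y)]  # D = X - Y
d_yx = [yi - xi for xi, yi in zip(x, y)]  # D = Y - X

# Sample mean and sample standard deviation of the differences.
mean_xy, sd_xy = mean(d_xy), stdev(d_xy)
mean_yx, sd_yx = mean(d_yx), stdev(d_yx)

print("D = X - Y:", d_xy, "mean:", mean_xy, "sd:", round(sd_xy, 3))
print("D = Y - X:", d_yx, "mean:", mean_yx, "sd:", round(sd_yx, 3))
```

The two conventions give sample means of equal size and opposite sign, while the standard deviation is the same in both cases. This is why the choice of convention is harmless as long as it is applied consistently.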
#\phantom{0}#
Paired Samples t-test: Purpose and Hypotheses
The paired samples #\boldsymbol{t}#-test is used to test hypotheses about the mean difference #\mu_D# between two paired samples.
Specifically, the test is used to determine whether or not it is plausible that #\mu_D# differs from some value #\Delta#. In most situations #\Delta=0#, so we will only present this specific setting.
The hypotheses of a two-tailed paired samples #t#-test are fairly straightforward:
\[H_0: \mu_D = 0\]
\[H_a: \mu_D \neq 0\]
The hypotheses for one-tailed paired samples #t#-tests are a bit trickier to formulate, however, as they depend on the definition of the difference score #D# and the expectations of the researcher.
If you define #D# in such a way that the mean difference #\mu_D# is expected to be positive, a right-tailed test should be used:
\[H_0: \mu_D \leq 0\]
\[H_a: \mu_D \gt 0\]
If, on the other hand, you define #D# in such a way that the mean difference #\mu_D# is expected to be negative, a left-tailed test should be used:
\[H_0: \mu_D \geq 0\]
\[H_a: \mu_D \lt 0\]
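The link between the expected sign of #\mu_D# and the choice of hypotheses can be summarized in a small lookup function. This is only a sketch: the function name `paired_hypotheses` and its argument `expected_sign` are hypothetical, and the hypotheses are returned as plain strings.

```python
def paired_hypotheses(expected_sign=None):
    """Return (H0, Ha) for a paired samples t-test of mu_D against 0.

    expected_sign is a hypothetical argument: "positive", "negative",
    or None when the researcher expects no particular direction.
    """
    if expected_sign is None:
        return ("mu_D = 0", "mu_D != 0")   # two-tailed test
    if expected_sign == "positive":
        return ("mu_D <= 0", "mu_D > 0")   # right-tailed test
    if expected_sign == "negative":
        return ("mu_D >= 0", "mu_D < 0")   # left-tailed test
    raise ValueError("expected_sign must be 'positive', 'negative', or None")


# A researcher expects post-test scores Y to exceed pre-test scores X.
print(paired_hypotheses("positive"))  # with D = Y - X
print(paired_hypotheses("negative"))  # with D = X - Y
```

Both calls describe the same research expectation. Only the definition of #D# differs, which is why the definition must be fixed before the hypotheses are written down.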
Assumptions of the Paired Samples t-test
The following assumptions are required to hold in order for a paired samples #t#-test to produce valid results:
- Random sampling is used to draw the samples.
- The two samples are related: each score in one sample is paired with exactly one score in the other.
- The sampling distribution of the sample mean difference is approximately normally distributed. This condition of normality is met under the following circumstances:
  - If the sample is small #(n \lt 30)#, it is required that the difference scores are normally distributed:
    \[D\sim N(\mu_D, \sigma_D)\]
  - If the sample is sufficiently large #(n \geq 30)#, the Central Limit Theorem can be invoked and this requirement is not needed.
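The sample size rule for the normality condition can be expressed as a short check. The sketch below assumes the cut-off of #n = 30# used above. The function name `normality_condition` and its return strings are hypothetical, and the function only reports which condition applies. It does not test whether the difference scores are in fact normally distributed.

```python
def normality_condition(n, threshold=30):
    """Report which normality condition applies for n matched pairs."""
    if n < 2:
        raise ValueError("at least two matched pairs are needed")
    if n >= threshold:
        # Large sample: rely on the Central Limit Theorem.
        return "large sample: Central Limit Theorem applies"
    # Small sample: the difference scores themselves must be normal.
    return "small sample: difference scores must be normally distributed"


for n in (12, 30, 85):
    print(n, "->", normality_condition(n))
```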