Computing the Test Statistic and Making a Decision


Once the hypotheses of the test have been formulated and the significance level of the test has been set, it is time to collect the sample data and compute the test statistic.

Test Statistic

A test statistic is a single numerical value that quantifies the difference between the observed sample data and what you would expect to observe if the null hypothesis of the test is true.

Definition

A test statistic is generally composed of a ratio.
The numerator of the ratio is the obtained difference between the sample statistic and the hypothesized population parameter.
The denominator of the ratio is the standard error, which measures how large a difference is expected by chance alone.

Formula

\[\text{test statistic}=\cfrac{\text{obtained difference}}{\text{expected difference}}\]

In general, the larger the absolute value of the test statistic, the stronger the evidence against the null hypothesis being tested.

If the value of the test statistic falls inside the critical region, the null hypothesis is rejected. If the test statistic does not fall inside the critical region, the null hypothesis is not rejected.
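The decision rule above can be sketched in a few lines of Python. This is a minimal illustration, assuming a two-sided test with a critical region that is symmetric around zero; the function name `decide` and the input values are hypothetical and chosen only for demonstration.

```python
def decide(test_statistic: float, critical_value: float) -> str:
    """Two-sided decision rule (assumed here): reject H0 when the
    test statistic lies beyond the critical value in either tail."""
    if abs(test_statistic) > critical_value:
        return "reject H0"
    return "do not reject H0"


# Illustrative values only, not taken from any data set.
print(decide(2.30, 1.96))   # falls inside the critical region
print(decide(-0.80, 1.96))  # falls outside the critical region
```

Taking the absolute value covers both tails at once, so a strongly negative statistic leads to the same decision as a strongly positive one.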

The test statistic used in a #z#-test is the #Z#-statistic.

Z-Statistic

The #Z#-statistic for a #z#-test for a population mean #\mu# is obtained by transforming the sample mean #\bar{X}# into a #Z#-score:

\[Z=\cfrac{\bar{X}-\mu_0}{\sigma_{\bar{X}}} =\cfrac{\bar{X}-\mu_0}{\sigma/\sqrt{n}}\]
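Here #\mu_0# is the population mean specified by the null hypothesis, #\sigma# is the population standard deviation, and #n# is the sample size.

If the population from which the sample is drawn is normally distributed and the null hypothesis is true, then the sampling distribution of the #Z#-statistic is the Standard Normal Distribution. That is, #Z\sim N(0,1)#.

If the population is not normally distributed, but the sample size is large (#n>30#), the Central Limit Theorem allows us to proceed as if #Z# approximately follows the #N(0,1)# distribution.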

Random variables are usually represented by a capital letter and specific values of a random variable with the corresponding lowercase letter. Thus, a lowercase #z# will be used to denote the measured value of #Z# after the sample data has been collected. The value of #z# is used to assess the strength of the evidence against the null hypothesis.

As the difference between the observed sample mean #\bar{X}# and the hypothesized population mean #\mu_0# increases, the absolute value of #Z# becomes larger and the evidence against the null hypothesis becomes stronger.
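As a rough sketch, the #Z#-statistic formula translates directly into a small Python function. The function name `z_statistic`, its parameter names, and the numbers in the usage lines are hypothetical and serve only as an illustration; the population standard deviation is assumed to be known.

```python
import math


def z_statistic(sample_mean: float, mu0: float, sigma: float, n: int) -> float:
    """Return z = (sample_mean - mu0) / (sigma / sqrt(n)).

    mu0   : population mean under the null hypothesis
    sigma : known population standard deviation
    n     : sample size
    """
    if sigma <= 0 or n <= 0:
        raise ValueError("sigma and n must be positive")
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu0) / standard_error


# Made-up inputs: the same sample mean gives a larger z when n is larger,
# because the standard error shrinks.
print(z_statistic(52.0, 50.0, 8.0, 64))
print(z_statistic(52.0, 50.0, 8.0, 256))
```

The second call illustrates why the denominator matters: with a larger sample, the same obtained difference is less likely to be due to chance.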

4. Example Summer Course: Test Statistic Computation and Decision

After the conclusion of the Summer Course, the students are tested on their statistical knowledge. It turns out that the mean grade of the #100# students who attended the Summer Course is #\bar{X} = 7.0#.

This sample mean is then converted into a #z#-score, which will serve as the test statistic:

\[Z=\dfrac{\bar{X} - \mu_0}{\sigma_{\bar{X}}} = \dfrac{7.0 - 6.5}{0.1} = \dfrac{0.5}{0.1} = 5.00\]

Since #z=5.00 \gt 1.96#, the test statistic falls inside the critical region and the null hypothesis #H_0: \mu_{\text{after summer course}}=6.5# should be rejected.

The university, therefore, concludes that participating in the Summer Course has had a significant impact on the mean grade of the students.
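The computation and decision for the Summer Course example can be reproduced with a short script. This is only a sketch: the variable names are chosen for illustration, while the numbers (#\bar{X}=7.0#, #\mu_0=6.5#, #\sigma=1#, #n=100#, critical value #1.96#) are the ones used in the example.

```python
import math

# Values from the Summer Course example.
sample_mean = 7.0      # observed mean grade of the participants
mu0 = 6.5              # mean grade under the null hypothesis
sigma = 1.0            # known population standard deviation
n = 100                # sample size
critical_value = 1.96  # two-sided critical value for alpha = 0.05

standard_error = sigma / math.sqrt(n)
z = (sample_mean - mu0) / standard_error
reject_h0 = abs(z) > critical_value

print(f"standard error = {standard_error:.2f}")  # 0.10
print(f"z = {z:.2f}")                            # 5.00
print("reject H0" if reject_h0 else "do not reject H0")
```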

1. Research Question

A university wants to determine the effectiveness of a new Summer Course aimed at improving the statistical knowledge of its students. From previous years, it is known that the population of students currently has a mean grade of #\mu=6.5# and a standard deviation of #\sigma=1#.

In order to test the impact of the Summer Course, a total of #n=100# students are randomly selected to participate in the Summer Course.

2. Research Hypotheses

The null hypothesis should state that the Summer Course has no effect on the students' grades. If this is true, then the mean grade of the students who take the Summer Course will be equal to the mean grade of the students who have not taken the Summer Course.

The alternative hypothesis simply covers all other outcomes.

\[\begin{array}{c}
H_0: \mu_{\text{after summer course}}=6.5\\
H_a: \mu_{\text{after summer course}}\neq 6.5\\
\end{array}\]

3. Critical Region

If the null hypothesis is true, the Summer Course should have no effect on the students' grades, and the distribution of grades among the students who participate in the Summer Course will be identical to that of the original population of students; that is, a normal distribution with #\mu = 6.5# and #\sigma=1#.

Next, consider all possible sample means that could be obtained from a sample of #n =100# students. This is the distribution of sample means for #n=100#. If the null hypothesis is true, then the distribution of sample means will have the following properties:


#\mu_{\bar{X}}= \mu_0 = 6.5#
#\sigma_{\bar{X}} = \cfrac{\sigma}{\sqrt{n}}=\cfrac{1}{\sqrt{100}} = 0.1#

Finally, the distribution of sample means is used to determine the critical region of the test. The university decides to set the alpha level of the test at #\alpha = 0.05#, meaning the critical region consists of the most extreme #5\%# of the sampling distribution. Because the alternative hypothesis is two-sided, this #5\%# is split evenly over the two tails, with #2.5\%# in each.

The critical values for a two-sided #z#-test with #\alpha =0.05# are #Z = \pm 1.96#. This means that finding a #Z#-statistic less than #-1.96# or greater than #1.96# should lead to the rejection of the null hypothesis.
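The critical values quoted above can be checked with Python's standard library, which provides the inverse cumulative distribution function of the normal distribution through `statistics.NormalDist` (Python 3.8 or later). The sketch below assumes a two-sided test; the variable names are illustrative, and the conversion of the boundaries to the grade scale is an extra step added here for intuition.

```python
from statistics import NormalDist

alpha = 0.05           # significance level chosen by the university
mu0 = 6.5              # mean of the sampling distribution under H0
standard_error = 0.1   # sigma / sqrt(n) = 1 / sqrt(100)

# Two-sided test: alpha / 2 in each tail of the standard normal distribution.
z_upper = NormalDist().inv_cdf(1 - alpha / 2)
z_lower = -z_upper
print(f"critical z-values: {z_lower:.2f} and {z_upper:.2f}")

# The same boundaries expressed as sample means (grade scale).
mean_lower = mu0 + z_lower * standard_error
mean_upper = mu0 + z_upper * standard_error
print(f"critical sample means: {mean_lower:.3f} and {mean_upper:.3f}")
```

Expressed on the grade scale, a sample mean below roughly #6.304# or above roughly #6.696# would fall inside the critical region.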