7. Hypothesis Testing: Introduction to Hypothesis Testing
Computing the Test Statistic and Making a Decision
Once the hypotheses of the test have been formulated and the significance level of the test has been set, it is time to collect the sample data and compute the test statistic.
Test Statistic
A test statistic is a single numerical value that quantifies the difference between the observed sample data and what you would expect to observe if the null hypothesis of the test is true.
Definition
A test statistic is generally composed of a ratio.
The numerator of the ratio is the obtained difference between the sample statistic and the hypothesized population parameter.
The denominator of the ratio is the standard error, which measures how large a difference is expected by chance alone.
Formula
\[\text{test statistic}=\cfrac{\text{obtained difference}}{\text{expected difference}}\]
In general, a test statistic with a larger absolute value indicates stronger evidence against the null hypothesis being tested.
If the value of the test statistic falls inside the critical region, the null hypothesis is rejected. If the test statistic does not fall inside the critical region, the null hypothesis is not rejected.
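The decision rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the test is taken to be two-sided with significance level #\alpha = 0.05# (which matches the critical value #1.96# used in the example later in this section), and the function name `two_sided_decision` is hypothetical.

```python
from statistics import NormalDist


def two_sided_decision(z, alpha=0.05):
    """Return True if the null hypothesis is rejected in a two-sided z-test.

    Assumption: the critical region consists of both tails of the
    standard normal distribution, each with area alpha / 2.
    """
    # Boundary of the critical region, e.g. about 1.96 for alpha = 0.05
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)
    # The test statistic lies in the critical region if it is
    # further from zero than the critical value, in either direction.
    return abs(z) > critical_value


print(two_sided_decision(5.0))   # far inside the critical region
print(two_sided_decision(1.0))   # outside the critical region
```

For a one-sided test the critical region would lie in a single tail, so the comparison would have to be adapted accordingly.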
The test statistic used in a #z#-test is the #Z#-statistic.
Z-Statistic
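The #Z#-statistic for a #z#-test for a population mean #\mu# is obtained by transforming the sample mean #\bar{X}# into a #Z#-score, using the hypothesized population mean #\mu_0#, the population standard deviation #\sigma#, and the sample size #n#: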
\[Z=\cfrac{\bar{X}-\mu_0}{\sigma_{\bar{X}}} =\cfrac{\bar{X}-\mu_0}{\sigma/\sqrt{n}}\]
If the population from which the sample is drawn is normally distributed and the null hypothesis is true, then the sampling distribution of the #Z#-statistic is the Standard Normal Distribution. That is, #Z\sim N(0,1)#.
If the population is not normally distributed, but the sample size is large (#n>30#), the Central Limit Theorem allows us to proceed as if #Z\sim N(0,1)#.
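The role of the Central Limit Theorem can be checked with a small simulation. The sketch below is an illustration, not part of the course material: it assumes a skewed (exponential) population with mean #1# and standard deviation #1#, a sample size of #n = 50#, and a true null hypothesis. If #Z# is approximately standard normal, roughly #5\%# of the simulated samples should give #|z| \gt 1.96#. The function name and all parameter values are chosen for illustration only.

```python
import math
import random


def simulate_rejection_rate(n=50, replications=2000, seed=0):
    """Estimate how often |z| > 1.96 when H0 is true and the population is skewed."""
    rng = random.Random(seed)
    mu_0 = 1.0    # mean of an exponential population with rate 1
    sigma = 1.0   # standard deviation of that same population
    standard_error = sigma / math.sqrt(n)

    rejections = 0
    for _ in range(replications):
        # Draw one sample of size n from the non-normal population
        sample = [rng.expovariate(1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu_0) / standard_error
        if abs(z) > 1.96:
            rejections += 1
    return rejections / replications


rate = simulate_rejection_rate()
print(f"share of samples with |z| > 1.96: {rate:.3f}")
```

The estimated share will not be exactly #0.05#, both because of simulation noise and because the normal approximation is not perfect for a skewed population at this sample size.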
Random variables are usually represented by a capital letter and specific values of a random variable with the corresponding lowercase letter. Thus, a lowercase #z# will be used to denote the measured value of #Z# after the sample data has been collected. The value of #z# is used to assess the strength of the evidence against the null hypothesis.
As the difference between the observed sample mean #\bar{X}# and the hypothesized population mean #\mu_0# increases, the absolute value of #Z# becomes larger and the evidence against the null hypothesis becomes stronger.
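To see how the statistic responds to the observed difference, the formula can be evaluated for several sample means. The following is a minimal sketch: the function name `z_statistic` is hypothetical, and the values #\mu_0 = 6.5#, #\sigma = 1.0# and #n = 100# are assumptions chosen to give a standard error of #0.1#, as in the Summer Course example.

```python
import math


def z_statistic(sample_mean, mu_0, sigma, n):
    """Z-statistic for a population mean with known population standard deviation."""
    standard_error = sigma / math.sqrt(n)   # sigma divided by the square root of n
    return (sample_mean - mu_0) / standard_error


# Assumed values: mu_0 = 6.5, sigma = 1.0, n = 100 (standard error 0.1)
for sample_mean in (6.4, 6.5, 6.6, 6.8, 7.0):
    z = z_statistic(sample_mean, mu_0=6.5, sigma=1.0, n=100)
    print(f"sample mean {sample_mean:.1f} -> z = {z:+.2f}")
```

Sample means further from #\mu_0# produce values of #z# further from zero, and the sign of #z# shows on which side of #\mu_0# the sample mean lies.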
4. Example Summer Course: Test Statistic Computation and Decision
After the conclusion of the Summer Course, the students are tested on their statistical knowledge. It turns out that the mean grade of the #100# students who attended the Summer Course is #\bar{X} = 7.0#.
This sample mean is then converted into a #z#-score, which will serve as the test statistic:
\[z=\dfrac{\bar{X} - \mu_0}{\sigma_{\bar{X}}} = \dfrac{7.0 - 6.5}{0.1} = \dfrac{0.5}{0.1} = 5.00\]
Since #z=5.00 \gt 1.96#, the test statistic falls inside the critical region and the null hypothesis #H_0:\mu=6.5# should be rejected.
The university, therefore, concludes that participating in the Summer Course has had a significant impact on the mean grade of the students.
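The computation and decision for this example can be reproduced as follows. This is a sketch under one assumption: the critical value #1.96# is taken to come from a two-sided test with #\alpha = 0.05#. The standard error #0.1# is used directly as given in the example, and the variable names are chosen for illustration.

```python
from statistics import NormalDist

sample_mean = 7.0      # observed mean grade of the 100 students
mu_0 = 6.5             # population mean under the null hypothesis
standard_error = 0.1   # sigma / sqrt(n), as given in the example
alpha = 0.05           # assumed: 1.96 corresponds to a two-sided test at 5%

# Test statistic: obtained difference divided by the standard error
z = (sample_mean - mu_0) / standard_error

# Boundary of the two-sided critical region
critical_value = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > critical_value

print(f"z = {z:.2f}, critical value = {critical_value:.2f}, reject H0: {reject_h0}")
# prints: z = 5.00, critical value = 1.96, reject H0: True
```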