7. Hypothesis Testing: Practical 7
Introduction to Hypothesis Testing
Objectives
Learn how to do the following in R
- Apply the basic procedure for hypothesis testing through a simulation exercise
- Show how (un)likely a certain extreme outcome is by comparing it to a 'null-distribution'
Instruction
- Read through the text below
- Execute code-examples and compare your results with what is explained in the text
- Complete the exercises
- Time: 30 minutes
The basic idea behind Null Hypothesis testing
Null hypothesis testing is a formal approach for deciding between two interpretations of a statistical effect of a predictor variable (or 'treatment') on a population parameter. One interpretation is the null hypothesis (often symbolized H0 and read as “H-naught”): the predictor has no effect on the population parameter of interest, and any effect seen in the sample arose by chance, through sampling error. An example of an H0 is "Average class attendance is unrelated to rain". The other interpretation is the alternative hypothesis (often symbolized H1 or Ha): the predictor does have an effect on the population parameter, and the effect observed in the sample reflects it. For example, "Average class attendance is lower when it rains".
Other examples:
H0: There is no difference in average IQ between males and females (#\mu_{IQmale} = \mu_{IQfemale}#)
H1: There is a difference in average IQ between males and females (#\mu_{IQmale} \neq \mu_{IQfemale}#)
H0: The average IQ of males is greater than or equal to 100 (#\mu_{IQmale} \geq 100#)
H1: The average IQ of males is less than 100 (#\mu_{IQmale} < 100#)
Note that H0 always includes a statement of equality (an equal sign, possibly as part of "greater than or equal to" or "less than or equal to"), whereas H1 always states a strict inequality (not equal, larger, or smaller).
Null hypothesis testing is also called 'Null Hypothesis Significance Testing', abbreviated as NHST.
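The IQ hypotheses above can be written down in R as a rough illustration. The sketch below assumes simulated (made-up) IQ scores and uses the built-in `t.test()` function purely to show how the `alternative` and `mu` arguments encode H1 and the boundary value of H0. The variable names, sample sizes and the choice of a t-test are assumptions for illustration, not part of this practical.

```r
set.seed(7)  # make the simulated data reproducible

# Made-up IQ scores (NOT real data): 40 males and 40 females,
# both drawn from the same distribution, so H0 is true by construction
iq_male   <- rnorm(40, mean = 100, sd = 15)
iq_female <- rnorm(40, mean = 100, sd = 15)

# H0: mu_IQmale = mu_IQfemale   versus   H1: mu_IQmale != mu_IQfemale
two_sided <- t.test(iq_male, iq_female, alternative = "two.sided")

# H0: mu_IQmale >= 100   versus   H1: mu_IQmale < 100
# 'mu' is the boundary value from H0; 'alternative' gives the direction of H1
one_sided <- t.test(iq_male, mu = 100, alternative = "less")

# Inspect which hypotheses each test object encodes
two_sided$alternative
one_sided$alternative
one_sided$null.value
```

Note that the equal sign of H0 shows up as the single value passed to `mu` (or the implicit difference of 0 in the two-sample case), while the direction of H1 is set through `alternative`.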
Any effect observed in a sample statistic can be interpreted in either of these two ways: it might have occurred by chance even though there is no effect in the population, or it might reflect a real effect in the population. Researchers therefore need a formal method to decide which of the two is the case. Although there are many specific statistical tests, they are all based on the same general logic of null hypothesis testing. The steps are as follows:
- Assume for the moment that the null hypothesis is true. There is no effect in the population. E.g. you assume that class attendance is the same, regardless of whether it rains or not.
- Determine how likely the sample relationship would be if the null hypothesis were true. E.g. you find that class attendance is 0.6 when it rains and 0.8 when it does not rain. How likely is this result if there were no effect in reality?
- If the sample relationship would be extremely unlikely under the null hypothesis, then reject the null hypothesis in favor of the alternative hypothesis (class attendance is influenced by rain). If it would not be extremely unlikely, then retain the null hypothesis.
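The three steps above can be sketched as a small simulation in R. This is a minimal illustration under stated assumptions: the text gives only the attendance proportions 0.6 and 0.8, so the number of students per condition (`n_students`), the number of simulations (`n_sim`) and the threshold `alpha` are hypothetical values chosen for the example.

```r
set.seed(42)  # make the simulation reproducible

# Observed data (sample size is an ASSUMPTION; only 0.6 and 0.8 come from the text)
n_students  <- 50   # assumed number of students observed per condition
attend_rain <- 30   # 0.6 * 50 students attended when it rained
attend_dry  <- 40   # 0.8 * 50 students attended when it was dry
obs_diff    <- (attend_dry - attend_rain) / n_students   # observed difference

# Step 1: assume H0 is true, so both conditions share one attendance probability
p_null <- (attend_rain + attend_dry) / (2 * n_students)  # pooled proportion

# Step 2: build the null distribution by simulating many samples under H0
n_sim     <- 10000
sim_rain  <- rbinom(n_sim, size = n_students, prob = p_null)
sim_dry   <- rbinom(n_sim, size = n_students, prob = p_null)
null_diff <- (sim_dry - sim_rain) / n_students

# Proportion of simulated differences at least as large as the observed one
# (one-sided, matching H1: attendance is lower when it rains).
# Counts are compared to avoid floating-point issues with proportions.
p_value <- mean(sim_dry - sim_rain >= attend_dry - attend_rain)

# Step 3: decide, using an assumed conventional threshold
alpha    <- 0.05
decision <- if (p_value < alpha) "reject H0" else "retain H0"

# Show the observed difference against the null distribution
hist(null_diff, main = "Null distribution", xlab = "Difference in attendance (dry - rain)")
abline(v = obs_diff, col = "red", lwd = 2)

p_value
decision
```

With these assumed numbers the observed difference lies far in the right tail of the null distribution, so the simulated p-value falls below 0.05 and H0 is rejected. With fewer students per condition the same proportions could easily lead to retaining H0, which is worth trying by changing `n_students` and the two counts.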
“Null Hypothesis” from http://imgs.xkcd.com/comics/null_hypothesis.png (CC-BY-NC 2.5)