4. Probability Distributions: Common Distributions
The Normal Distribution
The normal distribution is arguably the most important distribution in statistics. Distributions that are approximately normal show up in nature all the time. For example, many human characteristics, such as height, weight, and intelligence, are approximately normally distributed. The normal distribution also plays a central role in inferential statistics, as many inferential techniques rest in some way on the assumption that the data are normally distributed.
#\phantom{0}#
A normal or Gaussian distribution is continuous, symmetric, unimodal, bell-shaped, and asymptotic to the horizontal axis: the tails of the curve get ever closer to the axis but never touch it.
The mean, mode, and median of a normal distribution all coincide with the same point in the center of the distribution.
The shorthand notation for a normal distribution with mean #\mu# and standard deviation #\sigma# is #N(\mu, \sigma)#.
#\phantom{0}#
The normal distribution is described by the following formula:
\[f(x) = \dfrac{1}{\orange{\sigma} \sqrt{2\pi}}e^{-\dfrac{(x-\blue{\mu})^2}{2\orange{\sigma}^2}}\]
Fortunately, there is no need to memorize this formula. The key takeaway is that two parameters determine the specific characteristics of a normal distribution, namely the mean of the distribution #\blue{\mu}# and the standard deviation #\orange{\sigma}#.
- Changing #\blue{\mu}# shifts the entire curve to the left or right. Another way to look at it is that the value of the mean determines the #\blue{\text{position}}# of the distribution along the scale.
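- Changing #\orange{\sigma}# either clusters scores closer together or spreads them out: a smaller #\orange{\sigma}# gives a narrower, taller curve, and a larger #\orange{\sigma}# gives a wider, flatter one. In other words, the standard deviation determines the #\orange{\text{shape}}# of the normal distribution.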
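To make the roles of the two parameters concrete, here is a minimal Python sketch that evaluates the formula above directly. The function name `normal_pdf` and the example values for #\mu# and #\sigma# are chosen purely for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the N(mu, sigma) curve at x, computed from the density formula."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

# Changing mu only moves the curve: the height at the mean stays the same.
peak_at_0 = normal_pdf(0.0, mu=0.0, sigma=1.0)
peak_at_5 = normal_pdf(5.0, mu=5.0, sigma=1.0)

# Changing sigma changes the shape: doubling sigma halves the height at the mean.
peak_wide = normal_pdf(0.0, mu=0.0, sigma=2.0)

print(peak_at_0, peak_at_5, peak_wide)
```

The first two values are identical, which reflects that #\mu# only sets the position of the curve. The third is half as large, because a wider curve must be flatter for the total area to stay the same.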
#\phantom{0}#
Although there are many different normal distributions, they all share the following properties with regard to the area under their curve:
- The total area under a normal curve is always equal to #1#.
- The area to the left of the mean is exactly equal to the area to the right of the mean, namely #0.5#.
- The Empirical Rule holds.
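The first two area properties can be checked numerically. The sketch below approximates the area under a normal curve with the trapezoidal rule, using `statistics.NormalDist` from the Python standard library. The helper `trapezoid_area`, the values #\mu = 170# and #\sigma = 10#, and the integration range of #10# standard deviations on either side of the mean are all assumptions made for illustration.

```python
from statistics import NormalDist

def trapezoid_area(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with the trapezoidal rule."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * width)
    return total * width

# Hypothetical normal distribution, chosen only for this example.
dist = NormalDist(mu=170.0, sigma=10.0)

# The curve never touches the axis, so integrate over a range wide enough
# that the area left out in the tails is negligible.
lower = dist.mean - 10 * dist.stdev
upper = dist.mean + 10 * dist.stdev

total_area = trapezoid_area(dist.pdf, lower, upper)
left_area = trapezoid_area(dist.pdf, lower, dist.mean)

print(round(total_area, 6), round(left_area, 6))
```

Any other choice of #\mu# and #\sigma# should give the same two areas, up to numerical error.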
Empirical Rule
Whenever scores are normally distributed, it is possible to use the Empirical Rule to make the following statements about the spread of the scores around the mean:
- Approximately 68% of scores will be located within 1 standard deviation from the mean #(\mu \pm 1\sigma)#.
- Approximately 95% of scores will be located within 2 standard deviations from the mean #(\mu \pm 2\sigma)#.
- Approximately 99.7% of scores will be located within 3 standard deviations from the mean #(\mu \pm 3\sigma)#.
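The percentages in the Empirical Rule are rounded. For a normal distribution, the probability of falling within #k# standard deviations of the mean equals #\text{erf}(k/\sqrt{2})#, where #\text{erf}# is the error function. The short sketch below uses this relationship to compute the more precise values. The function name `prob_within` is an illustrative choice.

```python
import math

def prob_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")
# within 1 standard deviation(s): 0.6827
# within 2 standard deviation(s): 0.9545
# within 3 standard deviation(s): 0.9973
```

Note that the result does not depend on #\mu# or #\sigma#, which is why the rule holds for every normal distribution.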
#\phantom{0}#
A special case of the normal distribution is the Standard Normal Distribution.
#\phantom{0}#
Standard Normal Distribution
The Standard Normal Distribution is a normal distribution with mean #\mu=0# and standard deviation #\sigma=1#.
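Let #X# be a continuous random variable. If #X\sim N(\mu, \sigma)#, then the standardized variable #Z#, obtained by subtracting the mean from #X# and dividing by the standard deviation, follows the Standard Normal Distribution. That is: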
\[Z=\cfrac{X-\mu}{\sigma}\sim N(0,1)\]
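As an illustration of standardization, the following sketch converts a score to a #Z#-value and checks that the probability of falling below that score is the same on both scales. The values #\mu = 100#, #\sigma = 15#, and #x = 130# are hypothetical, and `z_score` is a helper defined only for this example.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x: the number of standard deviations x lies from the mean."""
    return (x - mu) / sigma

# Hypothetical parameters and score, chosen only for this example.
mu, sigma = 100.0, 15.0
x = 130.0

z = z_score(x, mu, sigma)
print(z)  # 2.0

# P(X <= x) under N(mu, sigma) matches P(Z <= z) under N(0, 1).
p_original = NormalDist(mu, sigma).cdf(x)
p_standard = NormalDist(0.0, 1.0).cdf(z)
print(round(p_original, 4), round(p_standard, 4))  # 0.9772 0.9772
```

Because the two probabilities agree, any question about #X# can be answered by translating it into a question about #Z# and working with the Standard Normal Distribution alone.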