6. Parameter Estimation and Confidence Intervals: Estimation
Parameter Estimation
Statistical inference is the process of making statements about a population on the basis of sample data. There are two main methods of statistical inference:
- Parameter estimation
- Hypothesis testing
This subchapter will introduce the concept of parameter estimation.
Parameter Estimation
Parameter estimation is the process of using sample data to estimate the parameters of the population.
There are two types of parameter estimates: point estimates and confidence intervals.
Point estimate
A point estimate is a single value that serves as the best guess for the population parameter.
A downside of point estimation is that it provides no insight into the precision of the estimate.
Sample Statistics as Point Estimates
Examples of point estimation are:
- Using a sample mean #\bar{X}# as an estimate of a population mean #\mu#.
- Using a sample proportion #\hat{p}# as an estimate of a population proportion #\pi#.
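Both point estimates in the list above can be computed in a few lines. The following is a minimal sketch in Python; the exam scores and the pass mark of 7.0 are hypothetical values chosen only for illustration.

```python
from statistics import mean

# Hypothetical sample: exam scores of 8 students (illustrative values only).
scores = [6.5, 7.0, 8.0, 5.5, 9.0, 7.5, 6.0, 8.5]

# Hypothetical rule: a score of 7.0 or higher counts as a "success".
passed = [score >= 7.0 for score in scores]

x_bar = mean(scores)               # point estimate of the population mean (mu)
p_hat = sum(passed) / len(passed)  # point estimate of the population proportion (pi)

print(f"sample mean x_bar = {x_bar}")
print(f"sample proportion p_hat = {p_hat}")
```

Each result is a single number. Nothing in the output indicates how far it might lie from the population value it estimates.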
If you want to include a measure of precision in your estimate, compute a confidence interval instead.
Confidence interval
A confidence interval (#CI#) for a population parameter is a range of values, based on sample data, that are highly plausible candidates for the true value of that population parameter.
Confidence intervals are always accompanied by a corresponding confidence level. The confidence level is the probability that a confidence interval constructed from a randomly selected sample will enclose the target parameter.
From a practical perspective, the confidence level identifies the fraction of the time, in repeated sampling, that the confidence intervals constructed will contain the true value of the population parameter.
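To make the roles of the interval and the confidence level concrete, here is a minimal sketch of one possible procedure, which this section does not itself define: the interval #\bar{X} \pm z \cdot \sigma/\sqrt{n}# for a population mean, which assumes a normally distributed population whose standard deviation #\sigma# is known. The function name `z_interval`, the sample values, and the value #\sigma = 1.2# are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_interval(sample, sigma, confidence=0.95):
    """Confidence interval for a population mean, assuming sigma is known."""
    n = len(sample)
    # Critical value of the standard normal distribution,
    # roughly 1.96 for a 95% confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    x_bar = mean(sample)
    return x_bar - margin, x_bar + margin

# Hypothetical sample and assumed known population standard deviation.
sample = [6.5, 7.0, 8.0, 5.5, 9.0, 7.5, 6.0, 8.5]
low_95, high_95 = z_interval(sample, sigma=1.2, confidence=0.95)
low_99, high_99 = z_interval(sample, sigma=1.2, confidence=0.99)

print(f"95% CI: ({low_95:.2f}, {high_95:.2f})")
print(f"99% CI: ({low_99:.2f}, {high_99:.2f})")
```

Under this procedure, a higher confidence level produces a wider interval for the same sample: enclosing the parameter more often comes at the cost of precision.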
Suppose you are given a #95\%# #CI# for some population parameter that was computed using a specific procedure based on sample data.
This means that, before the sample was selected, there was a #0.95# probability that a sample would be selected that would produce a #CI# containing the true value of that parameter.
This implies that if we were to take #100# simple random samples from the population and use the same procedure to compute a #95\%# #CI# for each of those samples, we would expect about #95# of the #CI#s to contain the true value of the parameter and about #5# of them not to contain it.
Of course, as long as the true value of the parameter is unknown, we cannot tell which #CI#s contain the target parameter and which do not.
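The repeated-sampling interpretation can be checked by simulation, because in a simulation the true value of the parameter is known. The sketch below draws many random samples from a hypothetical normal population (#\mu = 50#, #\sigma = 10#, both chosen arbitrarily), computes a #95\%# #CI# for the mean from each sample using the known-#\sigma# interval #\bar{X} \pm z \cdot \sigma/\sqrt{n}#, and counts how many intervals contain #\mu#. The interval procedure is an assumption made for this illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(1)  # fixed seed so that the run is reproducible

MU, SIGMA = 50.0, 10.0       # hypothetical "true" population values
N, REPLICATIONS = 25, 2000   # sample size and number of repeated samples
Z = NormalDist().inv_cdf(0.975)  # critical value for a 95% confidence level

hits = 0
for _ in range(REPLICATIONS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = mean(sample)
    margin = Z * SIGMA / sqrt(N)
    # Does this sample's interval contain the true mean?
    if x_bar - margin <= MU <= x_bar + margin:
        hits += 1

coverage = hits / REPLICATIONS
print(f"{hits} of {REPLICATIONS} intervals contain mu (coverage = {coverage:.3f})")
```

The observed coverage should be close to #0.95#, though rarely exactly equal to it. Each individual interval either contains #\mu# or does not; the #95\%# describes the procedure across many samples.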
Interpretation of a Confidence Interval
Confidence intervals are often misinterpreted. Once a confidence interval has been computed from sample data, it is no longer correct to use the word "probability" in connection with the confidence interval.
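It is thus incorrect to state: "There is a #95\%# probability that the true value of the parameter is contained within the confidence interval." After all, once the interval has been computed, nothing in that statement is random anymore: the endpoints of the interval are fixed numbers and the true value of the parameter is a fixed (though unknown) number, so the parameter is either in the confidence interval or it is not.
What we can say, however, is: "We are #95\%# confident that the true value of the parameter is contained within the confidence interval." Here, "#95\%# confident" refers to the long-run success rate of the procedure used to construct the interval, not to a probability about this particular interval.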