### 6. Parameter Estimation and Confidence Intervals: Estimation

### Parameter Estimation

Statistical inference is the process of making statements about a population on the basis of sample data. There are two main methods of statistical inference:

- Parameter estimation
- Hypothesis testing

This subchapter will introduce the concept of *parameter estimation*.

#\phantom{0}#

Parameter Estimation

**Parameter estimation** is the process of using sample data to *estimate* the parameters of the population.

#\phantom{0}#

There are two types of parameter estimates: *point estimates* and *confidence intervals*.

#\phantom{0}#

Point estimate

A **point estimate** is a single value which is the best guess for the population parameter.

A downside of point estimation is that it provides no insight into the *precision* of the estimate.

Sample Statistics as Point Estimates

Examples of point estimation are:

- Using a sample mean #\bar{X}# as an estimate of a population mean #\mu#.
- Using a sample proportion #\hat{p}# as an estimate of a population proportion #\pi#.
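The two point estimates listed above can be sketched in a few lines of Python. The data and the variable names (`heights_cm`, `owns_bicycle`) are hypothetical and invented purely for illustration; only the standard library is used.

```python
from statistics import mean

# Hypothetical sample data, invented for illustration only.
heights_cm = [172.0, 165.5, 180.2, 158.9, 175.4, 169.1]
owns_bicycle = [True, False, True, True, False, True, True, False]

# Sample mean: a point estimate of the population mean (mu).
x_bar = mean(heights_cm)

# Sample proportion: a point estimate of the population proportion (pi).
# True counts as 1 and False as 0, so sum() counts the "successes".
p_hat = sum(owns_bicycle) / len(owns_bicycle)

print(f"x_bar = {x_bar:.2f}")
print(f"p_hat = {p_hat:.3f}")
```

Each estimate is a single number; nothing in the output indicates how far it might lie from the population value.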

#\phantom{0}#

If you want to include a measure of precision in your estimate, compute a *confidence interval* instead.

#\phantom{0}#

Confidence interval

A **confidence interval** #(CI)# for a population parameter is a range of values, based on sample data, which are highly plausible candidates for the true value of that population parameter.

Confidence intervals are always accompanied by a corresponding *confidence level*. The **confidence level** is the probability that a random confidence interval will enclose the target parameter.
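
In symbols, this definition can be sketched as follows. The notation is an assumption introduced here and is not part of the original text: #\theta# denotes the target parameter, #L# and #U# denote the lower and upper endpoints of the interval computed from a random sample, and #1-\alpha# denotes the confidence level.

#P(L \leq \theta \leq U) = 1 - \alpha#

The randomness in this statement lies in the endpoints #L# and #U#, which vary from sample to sample; the parameter #\theta# is a fixed number.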

From a practical perspective, the confidence level identifies the fraction of the time, in repeated sampling, that the confidence intervals constructed will contain the true value of the population parameter.

Suppose you are given a #95\%# #CI# for some population parameter that was computed using a specific procedure based on sample data.

This means that, before the sample was selected, there was a #0.95# probability that a sample would be selected that would produce a #CI# containing the true value of that parameter.

This implies that if we were to take #100# simple random samples from the population and use the same procedure to compute a #95\%# #CI# for each of those samples, we would expect about #95# of the #CI#s to contain the true value of the parameter and about #5# of them not to contain it.
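
The repeated-sampling idea can be checked with a small simulation. The sketch below is only one possible setup and rests on assumptions not made in the text: the population is normal with a known standard deviation, and the interval is the standard z-interval for a mean. The function names `z_interval` and `coverage`, and the values chosen for `mu`, `sigma` and `n`, are hypothetical.

```python
import random
from statistics import NormalDist, mean


def z_interval(sample, sigma, level=0.95):
    """CI for a mean, assuming a normal population with known sd `sigma`."""
    # Critical value, e.g. about 1.96 for a 95% confidence level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sigma / len(sample) ** 0.5
    centre = mean(sample)
    return centre - margin, centre + margin


def coverage(mu, sigma, n, repetitions, level=0.95, seed=0):
    """Fraction of simulated CIs that contain the true mean `mu`."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(repetitions):
        # Draw a fresh random sample of size n from the population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = z_interval(sample, sigma, level)
        # In a simulation mu is known, so each interval can be checked.
        if lower <= mu <= upper:
            hits += 1
    return hits / repetitions


print(coverage(mu=50.0, sigma=10.0, n=25, repetitions=10_000))
```

With many repetitions the printed fraction should be close to #0.95#. Any single interval, however, either contains the true mean or it does not.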

Of course, as long as the true value of the parameter is unknown, we cannot tell which #CI#s contain the target parameter and which do not.

Interpretation of a Confidence Interval

Confidence intervals are often misinterpreted. Once a confidence interval has been computed from sample data, it is no longer correct to use the word "probability" in connection with the confidence interval.

It is thus incorrect to state: "There is a #95\%# probability that the true value of the parameter is contained within the confidence interval." After all, the true value of the parameter is not random; it is either in the confidence interval or it is not.

What we can say, however, is: "We are #95\%# *confident* that the true value of the parameter is contained within the confidence interval."