Confidence Interval for the Difference Between Two Independent Proportions

Assuming the sampling distribution of the difference between two sample proportions is (approximately) normal, the general formula for computing a $C\%\,CI$ for the difference between the two population proportions $\pi_1- \pi_2$ is:

$CI_{(\pi_1 - \pi_2)}=(\hat{p}_1 - \hat{p}_2) \pm z^*\cdot \sqrt{\cfrac{\hat{p}_1 \cdot (1 - \hat{p}_1)}{n_1}+\cfrac{\hat{p}_2 \cdot (1 - \hat{p}_2)}{n_2}}$

where $z^*$ is the critical value of the standard normal distribution such that $\mathbb{P}(-z^* \leq Z \leq z^*) = \cfrac{C}{100}$ .

Calculating z* with Statistical Software

Let $C$ be the confidence level in $\%$ .

To calculate the critical value $z^*$ in Excel, make use of the function NORM.INV():

$=\text{NORM.INV}((100+C)/200, 0, 1)$

To calculate the critical value $z^*$ in R, make use of the function qnorm():

$\text{qnorm}(p=(100+C)/200, mean=0, sd=1,lower.tail = \text{TRUE})$
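The probability $(100+C)/200$ arises because a central area of $C/100$ leaves $(100-C)/200$ in each tail, so the area to the left of $z^*$ is $C/100 + (100-C)/200 = (100+C)/200$. A minimal R sketch of this, where the helper name `z_star` and the chosen confidence levels are illustrative assumptions rather than part of the original text:

```r
# Critical value z* for a C% confidence level: the (100 + C)/200 quantile
# of the standard normal distribution.
z_star <- function(C) {
  qnorm(p = (100 + C) / 200, mean = 0, sd = 1, lower.tail = TRUE)
}

# Illustrative confidence levels (in %)
conf_levels <- c(90, 95, 99)
z_values <- sapply(conf_levels, z_star)
print(round(z_values, 5))  # 1.64485 1.95996 2.57583
```
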

A simple random sample of size $n_1=90$ is selected from Amsterdam residents, of which $X_1=38$ have a Dutch museum card. Meanwhile, a simple random sample of size $n_2=96$ is selected from Rotterdam residents, of which $X_2=27$ have a Dutch museum card.

Construct a $93\%$ confidence interval for the difference between the two population proportions $\pi_1 - \pi_2$ . Round your answers to $3$ decimal places.

$CI_{(\pi_1 - \pi_2),\,93\%}=(0.015,\,\,\, 0.267)$

There are a number of different ways we can compute the confidence interval.

Excel Calculation

Since both $n_1$ and $n_2$ are considered large ($\gt 30$), the Central Limit Theorem applies, so the sampling distribution of the difference between the two sample proportions is (approximately) normal.

If the sampling distribution of the difference between two sample proportions is (approximately) normal, the general formula for computing a $C\%\,CI$ for the difference between the two population proportions $\pi_1- \pi_2$ is:

$CI_{(\pi_1 - \pi_2)}=(\hat{p}_1 - \hat{p}_2) \pm z^*\cdot \sqrt{\cfrac{\hat{p}_1 \cdot (1 - \hat{p}_1)}{n_1}+\cfrac{\hat{p}_2 \cdot (1 - \hat{p}_2)}{n_2}}$

Compute the sample proportions $\hat{p}_1$ and $\hat{p}_2$ :

$\hat{p}_1=\cfrac{X_1}{n_1}=\cfrac{38}{90}=0.42222\\ \hat{p}_2=\cfrac{X_2}{n_2}=\cfrac{27}{96}=0.28125$

For a given confidence level $C$ (in $\%$ ), the critical value $z^*$ of the standard normal distribution is the value such that $\mathbb{P}(-z^* \leq Z \leq z^*)=\cfrac{C}{100}$ .

To calculate this critical value $z^*$ in Excel, make use of the following function:

NORM.INV(probability, mean, standard_dev)

probability: A probability corresponding to the normal distribution.

mean: The mean of the distribution.

standard_dev: The standard deviation of the distribution.

Here, we have $C=93$ . Thus, to calculate $z^*$ such that $\mathbb{P}(-z^* \leq Z \leq z^*)=0.93$ , run the following command:

$\begin{array}{c} =\text{NORM.INV}((100+C)/200, 0, 1)\\ \downarrow\\ =\text{NORM.INV}(193/200, 0, 1) \end{array}$

This gives:

$z^* = 1.81191$

Calculate the lower bound $L$ of the confidence interval:

$\begin{array}{rcl} L &=& (\hat{p}_1 - \hat{p}_2) - z^*\cdot \sqrt{\cfrac{\hat{p}_1 \cdot (1 - \hat{p}_1)}{n_1}+\cfrac{\hat{p}_2 \cdot (1 - \hat{p}_2)}{n_2}}\\ &=& (0.42222 - 0.28125) - 1.81191 \cdot \sqrt{\cfrac{0.42222 \cdot (1 - 0.42222)}{90}+\cfrac{0.28125 \cdot (1 - 0.28125)}{96}}\\ &=&0.015 \end{array}$

Calculate the upper bound $U$ of the confidence interval:

$\begin{array}{rcl} U &=& (\hat{p}_1 - \hat{p}_2) + z^*\cdot \sqrt{\cfrac{\hat{p}_1 \cdot (1 - \hat{p}_1)}{n_1}+\cfrac{\hat{p}_2 \cdot (1 - \hat{p}_2)}{n_2}}\\ &=& (0.42222 - 0.28125) + 1.81191 \cdot \sqrt{\cfrac{0.42222 \cdot (1 - 0.42222)}{90}+\cfrac{0.28125 \cdot (1 - 0.28125)}{96}}\\ &=&0.267 \end{array}$

Thus, the $93\%$ confidence interval for the difference between the two population proportions $\pi_1 - \pi_2$ is:

$CI_{(\pi_1 - \pi_2),\,93\%}=(0.015,\,\,\, 0.267)$
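
As a check on the arithmetic, both bounds can be written in terms of a single margin of error. The symbols $SE$ (standard error) and $ME$ (margin of error) are introduced here for convenience and are not part of the original solution; intermediate values are rounded to $5$ decimal places.

$\begin{array}{rcl} SE &=& \sqrt{\cfrac{0.42222 \cdot 0.57778}{90}+\cfrac{0.28125 \cdot 0.71875}{96}} \approx 0.06940\\ ME &=& z^* \cdot SE \approx 1.81191 \cdot 0.06940 \approx 0.12575\\ (\hat{p}_1 - \hat{p}_2) \pm ME &\approx& 0.14097 \pm 0.12575 \end{array}$

Rounding $0.14097 - 0.12575$ and $0.14097 + 0.12575$ to $3$ decimal places reproduces the bounds $0.015$ and $0.267$.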

R Calculation

To calculate the critical value $z^*$ in R, make use of the following function:

qnorm(p, mean, sd, lower.tail)

p: A probability corresponding to the normal distribution.

mean: The mean of the distribution.

sd: The standard deviation of the distribution.

lower.tail: If TRUE (default), probabilities are $\mathbb{P}(X \leq x)$ , otherwise, $\mathbb{P}(X \gt x)$ .
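
The effect of the lower.tail argument can be seen by computing the same critical value in two ways. The sketch below assumes a $93\%$ confidence level, so that the area to the left of $z^*$ is $0.965$ and the area to the right is $0.035$; the variable names are illustrative.

```r
# Area to the left of z* for a 93% central region: (100 + 93) / 200 = 0.965
z_from_lower <- qnorm(p = 0.965, mean = 0, sd = 1, lower.tail = TRUE)

# Area to the right of z*: (100 - 93) / 200 = 0.035
z_from_upper <- qnorm(p = 0.035, mean = 0, sd = 1, lower.tail = FALSE)

# Both calls describe the same point on the standard normal curve
print(isTRUE(all.equal(z_from_lower, z_from_upper)))  # TRUE
```
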

Here, we have $C=93$ . Thus, to calculate $z^*$ such that $\mathbb{P}(-z^* \leq Z \leq z^*)=0.93$ , run the following command:

$\begin{array}{c} \text{qnorm}(p = (100+C)/200, mean = 0, sd = 1, lower.tail = \text{TRUE})\\ \downarrow\\ \text{qnorm}(p =193/200, mean = 0, sd = 1, lower.tail = \text{TRUE}) \end{array}$

This gives:

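$z^* = 1.81191$

Substituting $\hat{p}_1$, $\hat{p}_2$, $n_1$, $n_2$ and $z^*$ into the formula, exactly as in the Excel calculation, gives the $93\%$ confidence interval for the difference between the two population proportions $\pi_1 - \pi_2$ :

$CI_{(\pi_1 - \pi_2),\,93\%}=(0.015,\,\,\, 0.267)$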
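
The whole calculation can also be carried out in a few lines of R. This is a minimal sketch rather than the original solution: the variable names are assumptions, and the final cross-check with base R's prop.test() (continuity correction switched off) is an addition that does not appear in the text above.

```r
# Sample data from the example
x1 <- 38; n1 <- 90   # Amsterdam: card holders, sample size
x2 <- 27; n2 <- 96   # Rotterdam: card holders, sample size
conf_pct <- 93       # confidence level C in %

# Sample proportions
p1_hat <- x1 / n1
p2_hat <- x2 / n2

# Critical value z* of the standard normal distribution
z_star <- qnorm(p = (100 + conf_pct) / 200, mean = 0, sd = 1, lower.tail = TRUE)

# Standard error of the difference and the resulting margin of error
se     <- sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
margin <- z_star * se

# Lower and upper bounds of the confidence interval
ci <- (p1_hat - p2_hat) + c(-1, 1) * margin
print(round(ci, 3))  # 0.015 0.267

# Optional cross-check (an addition, not part of the original solution)
check <- prop.test(x = c(x1, x2), n = c(n1, n2),
                   conf.level = conf_pct / 100, correct = FALSE)
print(round(as.numeric(check$conf.int), 3))
```

Working with unrounded intermediate values and rounding only at the end avoids accumulating rounding error in the bounds.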
