9. Simple Linear Regression
Inference about the Slope of a Linear Model
The linear relationship between two variables #X# and #Y# in the population can be expressed with the following equation:
\[Y_i = \beta_0 + \beta_1 \cdot X_i + \epsilon_i\]
To perform statistical inference about the slope #\beta_1# of a linear model, we first need to determine the standard error of the slope.
#\phantom{0}#
Standard Error of the Slope
The standard error of the slope #s_{b_1}# measures the amount of error we can reasonably expect when using sample data to estimate the slope #\beta_1# of a linear regression model. It is calculated with the following formula:
\[s_{b_1} = \cfrac{s_E}{\sqrt{SS_X}} = \cfrac{s_E}{\displaystyle \sqrt{ \sum_{i=1}^{n} (X_i - \bar{X})^2 }}\]
where #s_E# is the standard error of the estimate.
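As a sketch of how this formula could be applied to raw data, the following Python function (the function name is illustrative, not part of any standard library) computes #s_{b_1}# by first fitting the least-squares line and then dividing the standard error of the estimate by #\sqrt{SS_X}#:

```python
import numpy as np

def slope_standard_error(x, y):
    """Estimate the standard error of the slope, s_b1 = s_E / sqrt(SS_X),
    where s_E is the standard error of the estimate (residual standard
    deviation with n - 2 degrees of freedom) and SS_X is the sum of
    squared deviations of X from its mean."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Least-squares estimates of the slope b1 and intercept b0
    ss_x = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / ss_x
    b0 = y.mean() - b1 * x.mean()
    # Standard error of the estimate: s_E = sqrt(SSE / (n - 2))
    residuals = y - (b0 + b1 * x)
    s_e = np.sqrt(np.sum(residuals ** 2) / (n - 2))
    return s_e / np.sqrt(ss_x)
```

Note that when the data lie exactly on a line, the residuals are zero and the function returns #s_{b_1} = 0#, as expected.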
#\phantom{0}#
We will now take a look at two ways we can use the standard error of the slope to perform statistical inference about the slope coefficient #\beta_1# of a linear regression model.
The calculation of #s_{b_1}# assumes that the errors #\epsilon_i# in the population are independent and normally distributed with a constant but unknown variance #\sigma^2_\epsilon#. This variance is estimated by #s^2_E#. Because two parameters, the intercept and the slope, are estimated from the data, we are left with #n-2# free data points on which the inference can be based. That is why inference for simple regression uses the #t#-distribution with #df=n-2# (#df# = degrees of freedom).
#\phantom{0}#
Confidence Interval for the Slope of a Linear Model
The general formula for computing a #C\%\,CI# for the slope #\beta_1# is:
\[CI_{\beta_1}=\bigg(b_1 - t^*\cdot s_{b_1},\,\,\,\, b_1 + t^*\cdot s_{b_1} \bigg)\]
where #t^*# is the critical value of the #t_{n-2}# distribution such that #\mathbb{P}(-t^* \leq t \leq t^*)=\frac{C}{100}#.
Calculating t* with Statistical Software
Let #C# be the confidence level in #\%#.
To calculate the critical value #t^*# in Excel, make use of the function T.INV():
\[=\text{T.INV}((100+C)/200, n-2)\]
To calculate the critical value #t^*# in R, make use of the function qt():
\[\text{qt}(p=(100+C)/200,\ df=n-2,\ lower.tail = \text{TRUE})\]
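The same computation can be sketched in Python with scipy; the helper below (its name is illustrative) converts the confidence level #C# to the upper-tail probability #(100+C)/200#, looks up #t^*#, and returns the interval #(b_1 - t^*\cdot s_{b_1},\, b_1 + t^*\cdot s_{b_1})#:

```python
from scipy import stats

def slope_confidence_interval(b1, s_b1, n, C=95):
    """C% confidence interval for the slope beta_1.

    (100 + C) / 200 converts the confidence level C (in %) to the
    required cumulative probability, e.g. C = 95 -> 0.975, so that
    t* is the critical value of the t-distribution with n - 2 df."""
    t_star = stats.t.ppf((100 + C) / 200, df=n - 2)
    margin = t_star * s_b1
    return (b1 - margin, b1 + margin)
```

For example, with #b_1 = 2#, #s_{b_1} = 0.5#, and #n = 22#, the interval is centered at #2# with half-width #t^* \cdot 0.5#, where #t^*# is taken from the #t_{20}# distribution.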
#\phantom{0}#
We can also use the standard error of the slope to perform a hypothesis test for the value of the slope #\beta_1# of a linear regression model.
#\phantom{0}#
Hypothesis Test for the Slope of a Linear Model
The hypotheses of a two-sided test for the slope #\beta_1# of a linear model are:
\[\begin{array}{rcl}
H_0: \beta_1 = 0 & (\text{There is no linear relationship between }X \text{ and } Y \text{ in the population})\\
H_a: \beta_1 \neq 0 & (\text{There is a linear relationship between }X \text{ and } Y \text{ in the population})\,\,
\end{array}\]
The relevant test statistic for the null hypothesis #H_0: \beta_1 = 0# is:
\[t_{b_1} = \cfrac{b_1 - 0}{s_{b_1}} = \cfrac{b_1}{s_{b_1}}\]
Under the null hypothesis of the test, #t_{b_1}# follows a #t#-distribution with #df = n-2# degrees of freedom:
\[t_{b_1} \sim t_{n-2}\]
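As a minimal sketch of the two-sided test, the following Python function (illustrative name, using scipy) computes the test statistic #t_{b_1} = b_1 / s_{b_1}# and its two-sided #p#-value from the #t_{n-2}# distribution:

```python
from scipy import stats

def slope_t_test(b1, s_b1, n):
    """Two-sided t-test for H0: beta_1 = 0.

    Returns the test statistic t_b1 = b1 / s_b1 and the two-sided
    p-value from the t-distribution with n - 2 degrees of freedom."""
    t_b1 = b1 / s_b1
    # Two-sided p-value: 2 * P(T >= |t_b1|) under H0
    p_value = 2 * stats.t.sf(abs(t_b1), df=n - 2)
    return t_b1, p_value
```

For instance, #b_1 = 2#, #s_{b_1} = 0.5#, and #n = 22# give #t_{b_1} = 4# with #df = 20#, a value far in the tail of the #t_{20}# distribution, so the #p#-value is very small and #H_0# would be rejected at any conventional #\alpha#.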
Calculating the p-value of a Hypothesis Test for the Slope of a Linear Model
The calculation of the #p#-value of a #t#-test for #\beta_1# depends on the direction of the test and can be performed using either Excel or R.
To calculate the #p#-value of a #t#-test for #\beta_1# in Excel, make use of one of the following commands:
\[\begin{array}{llll}
\phantom{0}\text{Direction}&\phantom{0000}H_0&\phantom{0000}H_a&\phantom{000000000}\text{Excel Command}\\
\hline
\text{Two-tailed}&H_0:\beta_1 = 0&H_a:\beta_1 \neq 0&=2 \text{ * }(1 \text{ - } \text{T.DIST}(\text{ABS}(t),n\text{ - }2,1))\\
\text{Left-tailed}&H_0:\beta_1 \geq 0&H_a:\beta_1 \lt 0&=\text{T.DIST}(t,n\text{ - }2,1)\\
\text{Right-tailed}&H_0:\beta_1 \leq 0&H_a:\beta_1 \gt 0&=1\text{ - }\text{T.DIST}(t,n\text{ - }2,1)\\
\end{array}\]
To calculate the #p#-value of a #t#-test for #\beta_1# in R, make use of one of the following commands:
\[\begin{array}{llll}
\phantom{0}\text{Direction}&\phantom{0000}H_0&\phantom{0000}H_a&\phantom{00000000000}\text{R Command}\\
\hline
\text{Two-tailed}&H_0:\beta_1 = 0&H_a:\beta_1 \neq 0&2 \text{ * }\text{pt}(\text{abs}(t),n\text{ - }2,lower.tail=\text{FALSE})\\
\text{Left-tailed}&H_0:\beta_1 \geq 0&H_a:\beta_1 \lt 0&\text{pt}(t,n\text{ - }2, lower.tail=\text{TRUE})\\
\text{Right-tailed}&H_0:\beta_1 \leq 0&H_a:\beta_1 \gt 0&\text{pt}(t,n\text{ - }2, lower.tail=\text{FALSE})\\
\end{array}\]
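The three directional #p#-value calculations in the table above can be mirrored in Python with scipy, where `stats.t.cdf` plays the role of `pt(..., lower.tail=TRUE)` and `stats.t.sf` the role of `pt(..., lower.tail=FALSE)` (the function below is an illustrative sketch, not a standard API):

```python
from scipy import stats

def slope_p_value(t, n, direction="two-tailed"):
    """p-value of a t-test for beta_1 with df = n - 2.

    direction: "two-tailed", "left-tailed", or "right-tailed".
    """
    df = n - 2
    if direction == "two-tailed":
        return 2 * stats.t.sf(abs(t), df)  # 2 * pt(abs(t), df, lower.tail=FALSE)
    if direction == "left-tailed":
        return stats.t.cdf(t, df)          # pt(t, df, lower.tail=TRUE)
    if direction == "right-tailed":
        return stats.t.sf(t, df)           # pt(t, df, lower.tail=FALSE)
    raise ValueError("unknown direction")
```

As a sanity check, at #t = 0# the two-tailed #p#-value is #1# and each one-tailed #p#-value is #0.5#, and for any #t# the left- and right-tailed #p#-values sum to #1#.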
If #p \lt \alpha#, reject #H_0# and conclude #H_a#. Otherwise, do not reject #H_0#.