Formulas, Statistical Tables and R Commands: Formulas
Formulas regression
1. Simple linear regression
Regression equation simple linear regression
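$$E(y) = \alpha + \beta x$$
where the regression coefficient $\beta$ is estimated by
$$b = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$$
and the intercept $\alpha$ is estimated by
$$a = \bar{y} - b\bar{x}$$
so that the prediction equation is $\hat{y} = a + bx$.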
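As a rough illustration, the least-squares estimates can be computed by hand. This is a minimal sketch, assuming the usual formulas $b = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2$ and $a = \bar{y} - b\bar{x}$; the function name `fit_simple_regression` and the data are made up for the example.

```python
def fit_simple_regression(x, y):
    """Return (a, b): the least-squares intercept and slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares of x around their means
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx          # estimated regression coefficient
    a = y_bar - b * x_bar    # estimated intercept
    return a, b


# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = fit_simple_regression(x, y)
print(f"y_hat = {a:.2f} + {b:.2f} x")  # prints: y_hat = 2.20 + 0.60 x
```

In practice statistical software reports the same two numbers as the intercept and slope of the fitted line.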
Residual
The residual (or prediction error) is
$$e_i = y_i - \hat{y}_i$$
where $y_i$ is the observed value and $\hat{y}_i$ the predicted value for person $i$.
Sums of squares for y
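$$TSS = SS_{reg} + SSE$$
where $TSS = \sum_i (y_i - \bar{y})^2$ is the total sum of squares of $y$, $SS_{reg} = \sum_i (\hat{y}_i - \bar{y})^2$ is the regression sum of squares 'explained' by the model and $SSE = \sum_i (y_i - \hat{y}_i)^2$ is the residual sum of squares.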
Proportion explained variation
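The proportion of explained variation (also called the proportional reduction in prediction error) is
$$r^2 = \frac{TSS - SSE}{TSS} = \frac{\sum_i (y_i - \bar{y})^2 - \sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$$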
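The decomposition of the sums of squares and the proportion of explained variation can be checked numerically. The sketch below assumes the predictions come from a least-squares fit (only then does the regression sum of squares equal the total minus the residual sum of squares); the helper name `sums_of_squares` and the data are illustrative.

```python
def sums_of_squares(y, y_hat):
    """Return (tss, ss_reg, sse) for observed y and least-squares predictions y_hat."""
    y_bar = sum(y) / len(y)
    tss = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # residual sum of squares
    ss_reg = tss - sse                                     # valid for a least-squares fit
    return tss, ss_reg, sse


# Hypothetical data: predictions from y_hat = 2.2 + 0.6 x for x = 1, ..., 5
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]
tss, ss_reg, sse = sums_of_squares(y, y_hat)
r_squared = (tss - sse) / tss
print(round(tss, 3), round(ss_reg, 3), round(sse, 3), round(r_squared, 3))
# prints: 6.0 3.6 2.4 0.6
```

Here 60% of the variation in the outcome is accounted for by the predictor.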
$t$-test for regression
The test statistic for regression coefficient $\beta$ assuming $H_0$: $\beta = 0$ is
$$t = \frac{b}{se}$$
where $se$, the standard error of $b$, is calculated by software. The $t$ statistic follows a $t$ distribution with $df = n - 2$ degrees of freedom, when the assumptions hold.
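The source leaves the standard error to software. For illustration only, the sketch below assumes the standard formula for simple regression, $se = s / \sqrt{\sum (x_i - \bar{x})^2}$, with $s$ the residual standard deviation; the function name `slope_t_statistic` and the data are hypothetical. A p-value would still require a $t$ table or statistical software.

```python
import math


def slope_t_statistic(x, y):
    """Return (t, df) for testing H0: beta = 0 in simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    a = y_bar - b * x_bar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))   # residual standard deviation
    se_b = s / math.sqrt(s_xx)     # assumed standard-error formula for the slope
    return b / se_b, n - 2


t, df = slope_t_statistic([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"t = {t:.3f}, df = {df}")  # prints: t = 2.121, df = 3
```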
Standardized residual
The standardized residual equals
$$\frac{y_i - \hat{y}_i}{se_{e_i}}$$
where $se_{e_i}$, the standard error for the residual $e_i = y_i - \hat{y}_i$, is calculated by software.
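The source does not say how software obtains the standard error of a residual. One common choice, assumed here, is $s\sqrt{1 - h_i}$ with leverage $h_i = 1/n + (x_i - \bar{x})^2 / \sum (x_j - \bar{x})^2$ for simple regression. The sketch below follows that assumption; the function name and data are illustrative.

```python
import math


def standardized_residuals(x, y):
    """Residuals divided by their (assumed) standard errors s * sqrt(1 - h_i)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    a = y_bar - b * x_bar
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    result = []
    for xi, e in zip(x, residuals):
        leverage = 1 / n + (xi - x_bar) ** 2 / s_xx   # h_i for simple regression
        result.append(e / (s * math.sqrt(1 - leverage)))
    return result


z = standardized_residuals([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print([round(zi, 2) for zi in z])
```

Observations with an absolute standardized residual well above 2 or 3 are usually inspected as potential outliers.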
Residual standard deviation
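The residual standard deviation based on $n$ observations equals
$$s = \sqrt{\frac{\sum_i (y_i - \hat{y}_i)^2}{n - 2}}$$
in other words: $s = \sqrt{\frac{SSE}{n - p}}$, where $p$ equals the number of parameters in the regression equation ($p = 2$ for simple regression).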
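A small sketch of the residual standard deviation, assuming it is the square root of the residual sum of squares divided by the number of observations minus the number of parameters. The function name `residual_sd` and the data are made up for the example.

```python
import math


def residual_sd(y, y_hat, p=2):
    """Residual standard deviation with p parameters (p = 2 for simple regression)."""
    n = len(y)
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # residual sum of squares
    return math.sqrt(sse / (n - p))


# Hypothetical observed values and least-squares predictions
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]
s = residual_sd(y, y_hat)
print(round(s, 3))  # prints: 0.894
```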
95% prediction interval for $y$
$$\hat{y} \pm 2s$$
where $s$ is the residual standard deviation and 2 is an approximation of $t_{0.025}$.
95% confidence interval for $E(y)$
$$\hat{y} \pm 2\,\frac{s}{\sqrt{n}}$$
where $s$ is the residual standard deviation, 2 is an approximation of $t_{0.025}$ and $n$ is the number of observations.
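The two approximate intervals differ only in their margin: the prediction interval for an individual observation uses twice the residual standard deviation, while the confidence interval for the mean divides that margin by the square root of the number of observations. The sketch below assumes exactly these approximations (predicted value plus or minus the margin); the function names and numbers are hypothetical.

```python
import math


def approx_prediction_interval(y_hat, s):
    """Approximate 95% interval for an individual y: y_hat +/- 2s."""
    return y_hat - 2 * s, y_hat + 2 * s


def approx_confidence_interval(y_hat, s, n):
    """Approximate 95% interval for the mean of y: y_hat +/- 2s / sqrt(n)."""
    margin = 2 * s / math.sqrt(n)
    return y_hat - margin, y_hat + margin


# Hypothetical values: predicted value 4.0, residual sd 0.5, 25 observations
print(approx_prediction_interval(4.0, 0.5))
print(approx_confidence_interval(4.0, 0.5, 25))
```

The confidence interval is always the narrower of the two, because the mean of the outcome can be estimated more precisely than a single observation can be predicted.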
2. Multiple linear regression
Regression equation multiple regression
For the independent variables (predictors) $x_1, x_2, \ldots, x_k$:
$$E(y) = \alpha + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k$$
Proportion explained variation
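The proportion of explained variation (proportional reduction in prediction error), or squared multiple correlation coefficient, is
$$R^2 = \frac{\sum_i (y_i - \bar{y})^2 - \sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$$
In other words: $R^2 = \frac{TSS - SSE}{TSS}$.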
Multiple correlation coefficient
$$R = \sqrt{R^2}$$
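$F$-test statistic regression analysis
The null hypothesis that all regression coefficients equal zero is tested using
$$F = \frac{MS_{reg}}{MSE} = \frac{(TSS - SSE)/(p - 1)}{SSE/(n - p)}$$
where $p$ equals the number of parameters in the regression equation and $n$ the number of observations. The degrees of freedom are $df_1 = p - 1$ and $df_2 = n - p$. $df_1$ and $df_2$ are often referred to as the numerator and denominator degrees of freedom.
$MS$ denotes mean squares. $MSE$ denotes the residual variance.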
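The $F$ statistic can be computed directly from the sums of squares. This is a sketch under the assumption that $F$ is the regression mean square divided by the residual mean square, with $p - 1$ and $n - p$ degrees of freedom; the function name `f_statistic` and the numbers are invented for the example.

```python
def f_statistic(tss, sse, n, p):
    """Return (F, df1, df2) for H0: all regression coefficients equal zero."""
    df1 = p - 1                      # number of predictors
    df2 = n - p                      # observations minus parameters
    ms_reg = (tss - sse) / df1       # regression mean square
    mse = sse / df2                  # residual mean square (residual variance)
    return ms_reg / mse, df1, df2


# Hypothetical model: 2 predictors plus intercept (p = 3), n = 23 observations
f, df1, df2 = f_statistic(tss=100.0, sse=40.0, n=23, p=3)
print(f, df1, df2)  # prints: 15.0 2 20

# The same value follows from R^2 = (TSS - SSE) / TSS = 0.6
r2 = (100.0 - 40.0) / 100.0
f_from_r2 = (r2 / df1) / ((1 - r2) / df2)
```

The second computation shows that the statistic depends on the data only through the proportion of explained variation and the two degrees of freedom.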
Test statistic for $\beta_j$
The test statistic for regression coefficient $\beta_j$ assuming $H_0$: $\beta_j = 0$ is
$$t = \frac{b_j}{se}$$
where $se$, the standard error of $b_j$, is calculated by software. The $t$ statistic follows a $t$ distribution with $df = n - p$ degrees of freedom, where $p$ equals the number of parameters in the regression equation ($p = 2$ for simple regression), when the assumptions hold.
95% confidence interval for $\beta_j$
$$b_j \pm t_{0.025} \cdot se$$
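A confidence interval for a regression coefficient is the estimate plus or minus a critical $t$ value times its standard error. The sketch below assumes that form and takes the critical value as an input, since it has to be looked up in a $t$ table or obtained from software; the function name and the numbers are hypothetical.

```python
def coefficient_confidence_interval(b, se, t_crit):
    """Return (lower, upper) = b -/+ t_crit * se.

    t_crit is the critical t value for the chosen confidence level and
    df = n - p; it is supplied by the caller, not computed here.
    """
    margin = t_crit * se
    return b - margin, b + margin


# Hypothetical estimate b = 0.6 with se = 0.283; t_0.025 = 3.182 for df = 3
lower, upper = coefficient_confidence_interval(0.6, 0.283, 3.182)
print(round(lower, 2), round(upper, 2))
```

An interval that contains zero corresponds to a coefficient that is not significant in the two-sided $t$-test at the matching level.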
3. Exponential regression
Regression equation simple exponential regression
$$E(y) = \alpha \beta^x$$
where $\beta > 0$ must hold. This population-level equation provides the predicted value for the population mean of $y$ for a given value of $x$.
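For illustration, assume the exponential regression equation has the form $E(y) = \alpha \beta^x$ with $\beta > 0$. Under that assumption every one-unit increase in the predictor multiplies the mean of the outcome by $\beta$, which the sketch below makes visible. The function name `exponential_mean` and the parameter values are made up.

```python
def exponential_mean(alpha, beta, x):
    """Mean of y at x under the assumed model E(y) = alpha * beta**x."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    return alpha * beta ** x


# Hypothetical parameters: start at 100 and grow by 10% per unit of x
alpha, beta = 100.0, 1.1
means = [exponential_mean(alpha, beta, x) for x in range(4)]
ratios = [means[i + 1] / means[i] for i in range(3)]
print([round(m, 1) for m in means])
print([round(r, 3) for r in ratios])  # each ratio equals beta
```

Taking logarithms turns the model into a straight line, $\log E(y) = \log \alpha + (\log \beta)\,x$, which is why such models are often fitted on the log scale.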
4. Logistic regression
Log-odds (logit)
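If $y$ is a dichotomous (binary) variable taking on values 0 or 1 and we denote $P(y = 1)$ as $\pi$, then:
$$\text{logit}(\pi) = \ln\left(\frac{\pi}{1 - \pi}\right)$$
To calculate the probability from a log-odds value requires the following rule: $\pi = \frac{e^{\text{logit}(\pi)}}{1 + e^{\text{logit}(\pi)}}$.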
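The conversion between a probability and its log-odds can be sketched as a pair of inverse functions. The names `logit` and `inv_logit` are illustrative; the inverse uses $1 / (1 + e^{-z})$, which is algebraically the same as $e^{z} / (1 + e^{z})$.

```python
import math


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1 - p))


def inv_logit(z):
    """Probability that corresponds to a log-odds value z."""
    return 1 / (1 + math.exp(-z))


print(logit(0.5))                   # prints: 0.0
print(inv_logit(0.0))               # prints: 0.5
print(round(logit(0.8), 3))         # log-odds of a probability of 0.8
print(round(inv_logit(logit(0.8)), 3))  # round trip back to 0.8
```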
Regression equation simple logistic regression
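$$\text{logit}[P(y = 1)] = \alpha + \beta x$$
for which the following holds:
$$P(y = 1) = \frac{e^{\alpha + \beta x}}{1 + e^{\alpha + \beta x}}$$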
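As a sketch of how the simple logistic regression equation is used, assume the log-odds of the outcome are linear in the predictor, $\alpha + \beta x$. The probability then follows from the inverse logit, and each one-unit increase in the predictor multiplies the odds by $e^{\beta}$. The function name and the parameter values below are hypothetical.

```python
import math


def logistic_probability(alpha, beta, x):
    """P(y = 1) at x when the log-odds equal alpha + beta * x (assumed model)."""
    z = alpha + beta * x
    return math.exp(z) / (1 + math.exp(z))


def odds(p):
    """Odds that correspond to a probability p."""
    return p / (1 - p)


alpha, beta = -3.0, 0.5   # made-up parameter values
for x in (0, 6, 12):
    print(x, round(logistic_probability(alpha, beta, x), 3))

# Odds ratio for a one-unit increase in x; equals exp(beta) up to rounding
odds_ratio = odds(logistic_probability(alpha, beta, 7)) / odds(
    logistic_probability(alpha, beta, 6)
)
```

With these values the probability is exactly one half at $x = 6$, the point where the log-odds are zero.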