A residual is the difference between an observed value and a predicted value in a regression model.
It is calculated as:
Residual = Observed value – Predicted value
One way to understand how well a regression model fits a dataset is to calculate the residual sum of squares, which is calculated as:
Residual sum of squares = Σ(eᵢ)²
where:
- Σ: A Greek symbol that means “sum”
- eᵢ: The ith residual
The lower the value, the better the model fits the dataset.
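As a quick illustration of the two formulas above, here is a short Python sketch that computes each residual and their sum of squares for a small made-up set of observed and predicted values (the numbers are illustrative only, not taken from the datasets used below):

```python
# Hypothetical observed values and model predictions
# (illustrative numbers, not from this tutorial's datasets).
observed = [8, 12, 10, 14, 16]
predicted = [9, 11, 11, 13, 17]

# Residual for each observation: observed value minus predicted value
residuals = [obs - pred for obs, pred in zip(observed, predicted)]

# Residual sum of squares: the sum of each squared residual
rss = sum(e ** 2 for e in residuals)

print(residuals)  # [-1, 1, -1, 1, -1]
print(rss)        # 5
```

Note that squaring makes every residual contribute positively, so models cannot "cancel out" overpredictions with underpredictions.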
This tutorial provides examples of how to calculate the residual sum of squares for a simple linear regression model and a multiple linear regression model in Excel.
Example 1: Residual Sum of Squares for Simple Linear Regression
Suppose we have the following dataset in Excel:
To calculate the residual sum of squares for a simple linear regression model using x as the predictor variable and y as the response variable, we can use the LINEST() function, which uses the following syntax:
LINEST(known_ys, [known_xs], [const], [stats])
where:
- known_ys: The range of y-values
- known_xs: The range of x-values
- const: Whether to force the intercept b to be zero. We will leave this blank, so the intercept is estimated normally.
- stats: Whether to return additional regression statistics. We will specify this as TRUE.
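As a sketch, if the y-values were in cells B2:B13 and the x-values in A2:A13 (hypothetical ranges; use whatever ranges hold your data), the formula would look like this, with const left blank between the two commas:

```
=LINEST(B2:B13, A2:A13, , TRUE)
```

In older versions of Excel this must be entered as an array formula with Ctrl+Shift+Enter; in Excel with dynamic arrays, pressing Enter spills the full output automatically.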
The following screenshot shows how to use this function in practice:
The residual sum of squares for the regression model is displayed in the last cell of the second column of the LINEST output (row 5, column 2). In this example, the residual sum of squares turns out to be 50.75.
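For readers who want to double-check Excel's output another way, the same calculation can be sketched in Python. The (x, y) values below are hypothetical stand-ins, since this tutorial's dataset appears only in the screenshot, so the resulting RSS will differ from the 50.75 above:

```python
import numpy as np

# Hypothetical (x, y) data standing in for the dataset
# shown in the Excel screenshot.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2, 5, 6, 8, 9, 12, 14, 15], dtype=float)

# Fit y = b0 + b1*x by ordinary least squares;
# polyfit returns the slope first, then the intercept.
b1, b0 = np.polyfit(x, y, deg=1)

# Residual sum of squares: Σ(observed − predicted)²
predicted = b0 + b1 * x
rss = np.sum((y - predicted) ** 2)
```

This is the same quantity LINEST reports in row 5, column 2 of its stats output.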
Example 2: Residual Sum of Squares for Multiple Linear Regression
Suppose we have the following dataset in Excel:
Once again we can use the LINEST() function to calculate the residual sum of squares for the model.
The only difference is that we’ll specify two columns of values for the known_xs argument:
The residual sum of squares for this multiple linear regression model turns out to be 49.83.
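The multiple-regression case can be sketched in Python as well, using a design matrix with an intercept column and two predictors. Again the data below is a hypothetical stand-in for the screenshot's dataset, so the RSS will not match the 49.83 above:

```python
import numpy as np

# Hypothetical dataset with two predictor columns, standing in
# for the Excel data shown only in the screenshot.
x1 = np.array([1, 2, 2, 3, 4, 5, 6, 7], dtype=float)
x2 = np.array([3, 3, 5, 6, 6, 7, 9, 10], dtype=float)
y = np.array([4, 5, 7, 8, 10, 11, 13, 15], dtype=float)

# Design matrix with an intercept column, mirroring LINEST's
# default of estimating the constant term.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Ordinary least squares fit; for an overdetermined, full-rank
# system, lstsq returns the residual sum of squares directly.
coef, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)
rss = float(residuals[0])
```

Computing `np.sum((y - X @ coef) ** 2)` by hand gives the same value, which is a useful sanity check that `rss` really is Σ(eᵢ)².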
Additional Resources
How to Perform Simple Linear Regression in Excel
How to Perform Multiple Linear Regression in Excel
Residual Sum of Squares Calculator