In statistics, R-squared (R²) measures the proportion of the variance in the response variable that can be explained by the predictor variable in a regression model.
We use the following formula to calculate R-squared:
$$R^2 = \left[ \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}} \right]^2$$
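The formula above can be sketched in Python as a small function that takes the six summary totals directly. The function name and argument names are illustrative choices, not part of the original article, and the zero-variance check is an added assumption about how to handle a degenerate input.

```python
import math


def r_squared_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Return R-squared for a simple linear regression from summary totals.

    Hypothetical helper: the name and signature are assumptions made
    for illustration only.
    """
    numerator = n * sum_xy - sum_x * sum_y
    var_x_term = n * sum_x2 - sum_x ** 2  # n*Σx² - (Σx)²
    var_y_term = n * sum_y2 - sum_y ** 2  # n*Σy² - (Σy)²
    if var_x_term <= 0 or var_y_term <= 0:
        # If either variable is constant, the correlation is undefined.
        raise ValueError("x or y has zero variance; R-squared is undefined")
    r = numerator / (math.sqrt(var_x_term) * math.sqrt(var_y_term))
    return r ** 2


# Illustrative totals for a made-up, perfectly linear dataset
# (x = 1, 2, 3 and y = 2, 4, 6); the result should be very close to 1.
print(r_squared_from_sums(n=3, sum_x=6, sum_y=12, sum_xy=28, sum_x2=14, sum_y2=56))
```

Passing totals instead of raw data keeps the function a direct transcription of the formula, so each term can be checked against the hand calculation.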
The following step-by-step example shows how to calculate R-squared by hand for a given regression model.
Step 1: Create a Dataset
First, let’s create a dataset:
Step 2: Calculate Necessary Metrics
Next, let’s calculate each metric that we need to use in the R² formula. For this dataset, the totals are n = 8, Σx = 72, Σy = 223, Σxy = 2169, Σx² = 818, and Σy² = 6447.
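As a rough sketch of this step, the totals the formula needs can be accumulated from two lists of observations. The function name is hypothetical, and the sample values below are made up for illustration; they are not the dataset used in this example.

```python
def regression_sums(x, y):
    """Return the totals used in the R-squared formula.

    Hypothetical helper; x and y are equal-length sequences of numbers.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of observations")
    return {
        "n": len(x),
        "sum_x": sum(x),
        "sum_y": sum(y),
        "sum_xy": sum(xi * yi for xi, yi in zip(x, y)),
        "sum_x2": sum(xi ** 2 for xi in x),
        "sum_y2": sum(yi ** 2 for yi in y),
    }


# Made-up illustrative data, NOT the article's dataset.
sample_x = [1, 2, 3]
sample_y = [2, 4, 6]
print(regression_sums(sample_x, sample_y))
```

Each dictionary entry corresponds to one column total in a hand-built table of x, y, xy, x², and y².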
Step 3: Calculate R-Squared
Lastly, we’ll plug each metric into the formula for R²:
- $R^2 = \left[ \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}} \right]^2$
- $R^2 = \left[ \frac{8(2169) - (72)(223)}{\sqrt{8(818) - (72)^2}\,\sqrt{8(6447) - (223)^2}} \right]^2$
- $R^2 = 0.6686$
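For readers who want to check the arithmetic, the intermediate values between the substituted formula and the final result can be written out as follows. This only expands the numbers already plugged in; no new quantities are introduced.

```latex
\begin{aligned}
R^2 &= \left[ \frac{17352 - 16056}{\sqrt{6544 - 5184}\,\sqrt{51576 - 49729}} \right]^2 \\
    &= \left[ \frac{1296}{\sqrt{1360}\,\sqrt{1847}} \right]^2 \\
    &= \frac{1296^2}{1360 \cdot 1847} = \frac{1679616}{2511920} \approx 0.669
\end{aligned}
```

To three decimal places this is consistent with the reported value.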
Note: The n in the formula represents the number of observations in the dataset; in this example, n = 8.
Assuming x is the predictor variable and y is the response variable in this regression model, the R-squared for the model is 0.6686.
This tells us that 66.86% of the variation in the variable y can be explained by the variable x.
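One way to see why this value can be read as a share of variation explained is to compare it with 1 minus the ratio of residual to total sum of squares from a fitted least-squares line. The sketch below assumes a simple linear regression with an intercept, for which the two quantities coincide. The function names and the sample data are hypothetical and are not the article's dataset.

```python
import math


def r_squared_from_correlation(x, y):
    """Squared correlation, using the summation formula from this article."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    sum_y2 = sum(yi ** 2 for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    return (numerator / denominator) ** 2


def r_squared_from_fit(x, y):
    """1 - SS_res / SS_tot for a least-squares line with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variation: what the fitted line fails to explain.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Total variation of y around its mean.
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Made-up illustrative data, NOT the article's dataset.
sample_x = [1, 2, 3, 4, 5]
sample_y = [2, 4, 5, 4, 5]
print(r_squared_from_correlation(sample_x, sample_y))
print(r_squared_from_fit(sample_x, sample_y))
```

The two printed values should agree up to floating-point error, which is what justifies interpreting the squared correlation as the proportion of variation in y accounted for by x in this setting.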