Two terms that students often get confused in statistics are **R** and **R-squared**, often written R^{2}.

In the context of simple linear regression:

- **R:** The correlation between the predictor variable, x, and the response variable, y.
- **R^{2}:** The proportion of the variance in the response variable that can be explained by the predictor variable in the regression model.

And in the context of multiple linear regression:

- **R:** The correlation between the observed values of the response variable and the predicted values of the response variable made by the model.
- **R^{2}:** The proportion of the variance in the response variable that can be explained by the predictor variables in the regression model.

Note that the value for R^{2} ranges between 0 and 1. The closer the value is to 1, the stronger the relationship between the predictor variable(s) and the response variable.
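To make "proportion of the variance explained" concrete, R^{2} is commonly written in terms of sums of squares. The symbols below are notation assumed here for illustration, not taken from the text above: y_i is an observed response, \hat{y}_i is the model's prediction for it, and \bar{y} is the mean of the observed responses.

$$
R^{2} = 1 - \frac{\text{SSE}}{\text{SST}} = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^{2}}{\sum_{i=1}^{n} (y_i - \bar{y})^{2}}
$$

When the predictions match the observed values exactly, SSE is 0 and R^{2} equals 1. When the model does no better than always predicting \bar{y}, SSE equals SST and R^{2} equals 0.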

The following examples show how to interpret the R and R-squared values in both simple linear regression and multiple linear regression models.

**Example 1: Simple Linear Regression**

Suppose we have the following dataset that shows the hours studied and exam score received by 12 students in a certain math class:

Using statistical software (like Excel, R, Python, SPSS, etc.), we can fit a simple linear regression model using “study hours” as the predictor variable and “exam score” as the response variable.

We can find the following output for this model:

Here’s how to interpret the R and R-squared values of this model:

- **R:** The correlation between hours studied and exam score is **0.959**.
- **R^{2}:** The R-squared for this regression model is **0.920**. This tells us that 92.0% of the variation in the exam scores can be explained by the number of hours studied.

Also note that the R^{2} value is simply equal to the R value, squared:

R^{2} = R * R = 0.959 * 0.959 = **0.920**
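The relationship between R and R^{2} in simple linear regression can be checked with a short Python sketch. The data below are hypothetical values made up for illustration (they are not the dataset from this example), and the helper names `pearson_r`, `fit_line`, and `r_squared` are likewise assumptions. The sketch computes R as the correlation between x and y, computes R^{2} from the fitted model's sums of squares, and compares the two.

```python
from math import sqrt

# Hypothetical data for illustration only (NOT the dataset from the example).
hours = [1, 2, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10]
scores = [62, 65, 68, 70, 74, 77, 80, 82, 85, 88, 93, 95]


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return my - slope * mx, slope


def r_squared(y, y_hat):
    """R-squared as 1 - SSE / SST."""
    my = mean(y)
    sse = sum((b - p) ** 2 for b, p in zip(y, y_hat))
    sst = sum((b - my) ** 2 for b in y)
    return 1 - sse / sst


r = pearson_r(hours, scores)
intercept, slope = fit_line(hours, scores)
predicted = [intercept + slope * h for h in hours]
r2 = r_squared(scores, predicted)

# R^2 from the fitted model should match R * R up to floating-point error.
print(f"R = {r:.3f}, R^2 = {r2:.3f}, R * R = {r * r:.3f}")
```

The two quantities are computed by different routes here (one from the raw correlation, one from the model's residuals), which is what makes the agreement a meaningful check rather than a tautology.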

**Example 2: Multiple Linear Regression**

Suppose we have the following dataset that shows the hours studied, current student grade, and exam score received by 12 students in a certain math class:

Using statistical software, we can fit a multiple linear regression model using “study hours” and “current grade” as the predictor variables and “exam score” as the response variable.

We can find the following output for this model:

Here’s how to interpret the R and R-squared values of this model:

- **R:** The correlation between the actual exam scores and the predicted exam scores made by the model is **0.978**.
- **R^{2}:** The R-squared for this regression model is **0.956**. This tells us that 95.6% of the variation in the exam scores can be explained by the number of hours studied and the student’s current grade in the class.

Also note that the R^{2} value is simply equal to the R value, squared:

R^{2} = R * R = 0.978 * 0.978 = **0.956**
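The same check can be sketched for multiple linear regression, where R is the correlation between the observed and predicted responses. The data below are hypothetical (not the dataset from this example), and `solve`, `fit_ols`, `pearson_r`, and `r_squared` are illustrative helper names. The fit uses the normal equations solved by Gaussian elimination, chosen here only to keep the sketch free of external libraries; statistical software typically uses more numerically robust methods.

```python
from math import sqrt

# Hypothetical data for illustration only (NOT the dataset from the example).
hours = [1, 2, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10]
grade = [70, 82, 65, 75, 90, 68, 85, 72, 88, 79, 92, 84]
scores = [64, 70, 63, 69, 78, 72, 80, 77, 86, 85, 93, 92]


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for c in range(col, n + 1):
                m[row][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(columns, y):
    """Fit y on the given predictor columns (plus an intercept); return predictions."""
    rows = [[1.0, *vals] for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    coefs = solve(xtx, xty)
    return [sum(c * v for c, v in zip(coefs, r)) for r in rows]


def pearson_r(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def r_squared(y, y_hat):
    my = sum(y) / len(y)
    sse = sum((b - p) ** 2 for b, p in zip(y, y_hat))
    sst = sum((b - my) ** 2 for b in y)
    return 1 - sse / sst


predicted = fit_ols([hours, grade], scores)
r = pearson_r(scores, predicted)   # correlation of observed vs. predicted
r2 = r_squared(scores, predicted)  # proportion of variance explained

print(f"R = {r:.3f}, R^2 = {r2:.3f}, R * R = {r * r:.3f}")
```

Note that R here is computed between the observed and predicted scores, not between the scores and any single predictor, which matches the definition of R used for multiple linear regression.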
