One of the most common metrics used to measure the prediction accuracy of a model is MSE, which stands for mean squared error. It is calculated as:
MSE = (1/n) * Σ(actual – prediction)^2
where:
- Σ – a fancy symbol that means “sum”
- n – sample size
- actual – the actual data value
- prediction – the predicted data value
The lower the value for MSE, the more accurately a model is able to predict values.
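To make the formula concrete, here is a minimal sketch in R that works through it one step at a time. The vectors `actual` and `prediction` hold four made-up values chosen only for illustration; they do not come from any dataset used in this tutorial.

```r
# Illustrative (made-up) values, not from a real dataset
actual     <- c(3, 5, 2, 7)
prediction <- c(2.5, 5, 4, 8)

# Step 1: errors (actual - prediction): 0.5, 0, -2, -1
errors <- actual - prediction

# Step 2: squared errors: 0.25, 0, 4, 1
squared_errors <- errors^2

# Step 3: sum the squared errors and divide by the sample size n
n   <- length(actual)
mse <- sum(squared_errors) / n
mse  # 5.25 / 4 = 1.3125
```

Note how the single error of -2 contributes 4 of the 5.25 total: squaring makes large misses count far more than small ones.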
How to Calculate MSE in R
Depending on what format your data is in, there are two easy methods you can use to calculate the MSE of a regression model in R.
Method 1: Calculate MSE from Regression Model
In one scenario, you may have a fitted regression model and would simply like to calculate the MSE of the model. For example, you may have the following regression model:
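#load mtcars dataset
data(mtcars)

#fit regression model (mpg on disp and hp, which reproduces the fitted values shown in Method 2)
model <- lm(mpg ~ disp + hp, data = mtcars)

#get model summary
model_summ <- summary(model)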
To calculate the MSE for this model, you can use the following formula:
#calculate MSE
mean(model_summ$residuals^2)
[1] 8.85917
This tells us that the MSE is 8.85917.
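As a cross-check, the same quantity can be obtained directly from the fitted model object without going through the summary. The sketch below assumes the model is `lm(mpg ~ disp + hp, data = mtcars)`; treat that formula as an assumption if your model differs. It also shows how this MSE differs from the residual variance that `summary()` reports, which divides by the residual degrees of freedom rather than by n.

```r
# Assumed model: mpg regressed on disp and hp (adjust to your own model)
data(mtcars)
model <- lm(mpg ~ disp + hp, data = mtcars)

# MSE straight from the model's residuals
mse <- mean(residuals(model)^2)

# Equivalent: residual sum of squares divided by the number of observations
rss     <- sum(residuals(model)^2)
mse_alt <- rss / nobs(model)

# Residual variance divides by the residual degrees of freedom (n minus the
# number of estimated coefficients), so it is larger than the MSE
sigma2 <- rss / df.residual(model)

mse
mse_alt
sigma2
```

Here `mse` and `mse_alt` should agree with each other, while `sigma2` equals `summary(model)$sigma^2`. Which one to report depends on whether you want a plain average of squared errors or an unbiased estimate of the error variance.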
Method 2: Calculate MSE from a List of Predicted and Actual Values
In another scenario, you may simply have a list of predicted and actual values. For example:
#create data frame with a column of actual values and a column of predicted values
data <- data.frame(pred = predict(model), actual = mtcars$mpg)

#view first six lines of data
head(data)
                      pred actual
Mazda RX4         23.14809   21.0
Mazda RX4 Wag     23.14809   21.0
Datsun 710        25.14838   22.8
Hornet 4 Drive    20.17416   21.4
Hornet Sportabout 15.46423   18.7
Valiant           21.29978   18.1
In this case, you can use the following formula to calculate the MSE:
#calculate MSE
mean((data$actual - data$pred)^2)
[1] 8.85917
This tells us that the MSE is 8.85917, which matches the MSE that we calculated using the previous method.
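If you compute MSE often, it may be convenient to wrap the calculation in a small function. The sketch below is one possible version; the name `mse` and its `na.rm` argument are hypothetical choices for illustration, not part of base R. It is applied to just the six rows shown above, so its result will differ from the MSE of the full dataset.

```r
# Hypothetical helper (not a base R function): mean squared error of two vectors
mse <- function(actual, predicted, na.rm = FALSE) {
  # Both inputs must be numeric and the same length
  stopifnot(is.numeric(actual), is.numeric(predicted),
            length(actual) == length(predicted))
  mean((actual - predicted)^2, na.rm = na.rm)
}

# The six actual and predicted values printed by head(data) above
actual <- c(21.0, 21.0, 22.8, 21.4, 18.7, 18.1)
pred   <- c(23.14809, 23.14809, 25.14838, 20.17416, 15.46423, 21.29978)

# MSE for these six rows only
mse(actual, pred)

# With missing values, na.rm = TRUE drops any pair containing an NA
mse(c(actual, NA), c(pred, 20), na.rm = TRUE)
```

The length check is there because R would otherwise silently recycle the shorter vector, which can produce a plausible-looking but wrong MSE.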