**Simple linear regression** is a method we can use to understand the relationship between a predictor variable and a response variable.

This tutorial explains how to perform simple linear regression in SPSS.

**Example: Simple Linear Regression in SPSS**

Suppose we have the following dataset that shows the number of hours studied and the exam score received by 20 students:

Use the following steps to perform simple linear regression on this dataset to quantify the relationship between hours studied and exam score:

**Step 1: Visualize the data.**

First, we'll create a scatterplot of hours and score to check that the relationship between the two variables appears to be linear. Otherwise, simple linear regression won't be an appropriate technique to use.

Click the **Graphs** tab, then click **Chart Builder**:

In the **Choose from** menu, click and drag **Scatter/Dot** into the main editing window. Then drag the variable **hours** onto the x-axis and **score** onto the y-axis.

Once you click **OK**, the following scatterplot will appear:

From the plot we can see that there is a positive linear relationship between hours and score. In general, students who study for more hours tend to get higher scores.

Since there's a clear linear relationship between the two variables, we'll proceed to fit a simple linear regression model to the dataset.
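As a numeric complement to the scatterplot, we can also compute the Pearson correlation between the two variables. The following is only a minimal sketch in plain Python, not something SPSS runs for us in this step; the function name `pearson_r` and the toy values are hypothetical and are not the tutorial's dataset.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy values for illustration only (NOT the tutorial's data)
toy_hours = [1, 2, 3, 4, 5]
toy_score = [70, 74, 75, 80, 82]
print(round(pearson_r(toy_hours, toy_score), 3))
```

A value close to 1 or -1 suggests a strong linear relationship, but it doesn't replace looking at the plot, since a curved pattern can still produce a fairly high correlation.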

**Step 2: Fit a simple linear regression model.**

Click the **Analyze** tab, then **Regression**, then **Linear**:

In the new window that pops up, drag the variable **score** into the box labelled Dependent and drag **hours** into the box labelled Independent. Then click **OK**.
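To get a feel for what this step estimates, here is a minimal sketch of the ordinary least squares formulas for one predictor, written in plain Python. It is an illustration under our own assumptions rather than SPSS's implementation; the function name `fit_simple_linear_regression` and the toy values are hypothetical.

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least squares line y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Slope = sum of cross-products / sum of squared deviations of x
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("the predictor must take at least two distinct values")
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical toy values for illustration only (NOT the tutorial's data)
toy_hours = [1, 2, 3, 4]
toy_score = [3, 5, 7, 9]
intercept, slope = fit_simple_linear_regression(toy_hours, toy_score)
print(intercept, slope)
```

In SPSS terms, the two returned values correspond to the Unstandardized B entries for the constant and for the predictor.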

**Step 3: Interpret the results.**

Once you click **OK**, the results of the simple linear regression will appear. The first table we're interested in is the one titled **Model Summary**:

Here is how to interpret the most relevant numbers in this table:

**R Square:** This is the proportion of the variance in the response variable that can be explained by the explanatory variable. In this example, **50.6%** of the variation in exam scores can be explained by hours studied.

**Std. Error of the Estimate:** The standard error is the average distance that the observed values fall from the regression line. In this example, the observed values fall an average of **5.861** units from the regression line.
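The two numbers above can be sketched from their usual textbook definitions: R Square as one minus the ratio of residual to total sum of squares, and the standard error of the estimate as the square root of the residual sum of squares divided by n - 2. The helper below is a hypothetical illustration in plain Python (the name `model_summary` and the toy values are our own assumptions, not SPSS output).

```python
import math


def model_summary(x, y, intercept, slope):
    """Return (r_square, std_error_of_estimate) for a fitted line."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length (at least 3)")
    mean_y = sum(y) / n
    # Residual sum of squares: squared vertical distances from the line
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    # Total sum of squares: squared distances from the mean of y
    sst = sum((b - mean_y) ** 2 for b in y)
    if sst == 0:
        raise ValueError("R Square is undefined when the response is constant")
    r_square = 1 - sse / sst
    # n - 2 degrees of freedom: two parameters (intercept, slope) are estimated
    std_error = math.sqrt(sse / (n - 2))
    return r_square, std_error


# Hypothetical toy values for illustration only (NOT the tutorial's data)
r_square, std_error = model_summary([0, 1, 2, 3], [1, 3, 5, 9], 1.0, 2.0)
print(round(r_square, 3), round(std_error, 3))
```

Note that the standard error is a root-mean-square distance rather than a literal arithmetic average, which is why "average distance" is best read as a rough interpretation.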

The next table we're interested in is titled **Coefficients**:

Here is how to interpret the most relevant numbers in this table:

**Unstandardized B (Constant):** This tells us the average value of the response variable when the predictor variable is zero. In this example, the average exam score is **73.662** when hours studied is equal to zero.

**Unstandardized B (hours):** This tells us the average change in the response variable associated with a one unit increase in the predictor variable. In this example, each additional hour studied is associated with an increase of **3.342** in exam score, on average.

**Sig (hours):** This is the p-value associated with the test statistic for hours. In this case, since this value is less than 0.05, we can conclude that the predictor variable **hours** is statistically significant.

Lastly, we can form a regression equation using the values for **constant** and **hours**. In this case, the equation would be:

Estimated exam score = 73.662 + 3.342*(hours)

We can use this equation to find the estimated exam score for a student, based on the number of hours they studied. For example, a student who studies for 3 hours is expected to receive an exam score of 83.688:

Estimated exam score = 73.662 + 3.342*(3) = 83.688
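The same calculation can be wrapped in a small function so it's easy to reuse for other values of hours. This is just a sketch in Python using the two coefficients reported above; the names `INTERCEPT`, `SLOPE`, and `estimated_exam_score` are our own and don't come from SPSS.

```python
INTERCEPT = 73.662  # Unstandardized B (Constant) from the Coefficients table
SLOPE = 3.342       # Unstandardized B (hours) from the Coefficients table


def estimated_exam_score(hours):
    """Return the estimated exam score for a given number of hours studied."""
    return INTERCEPT + SLOPE * hours


print(round(estimated_exam_score(3), 3))  # 83.688
```

Keep in mind that estimates are most trustworthy for values of hours within the range observed in the dataset.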

**Step 4: Report the results.**

Lastly, we want to summarize the results of our simple linear regression. Here's an example of how to do so:

A simple linear regression was performed to quantify the relationship between hours studied and exam score received. A sample of 20 students was used in the analysis.

Results showed that there was a statistically significant relationship between hours studied and exam score (t = 4.297, p < 0.05).

The regression equation was found to be:

Estimated exam score = 73.662 + 3.342*(hours)

Each additional hour studied is associated with an increase of 3.342 in exam score, on average.
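Before reporting, it can be reassuring to cross-check the t statistic in the report (4.297) against the Model Summary. With a single predictor, the squared t statistic for the slope equals (n - 2) * R² / (1 - R²), so it can be approximately recovered from R Square (0.506) and the sample size (20). The sketch below is a hypothetical helper in Python, not an SPSS feature, and it returns only the magnitude of t.

```python
import math


def t_from_r_square(r_square, n):
    """Return |t| for the slope in simple linear regression from R Square and n."""
    if not 0 <= r_square < 1:
        raise ValueError("r_square must be in [0, 1)")
    if n <= 2:
        raise ValueError("n must be greater than 2")
    # With one predictor: t^2 = F = (n - 2) * R^2 / (1 - R^2)
    return math.sqrt((n - 2) * r_square / (1 - r_square))


# R Square and sample size as reported in this tutorial
t_check = t_from_r_square(0.506, 20)
print(round(t_check, 2))  # 4.29
```

The small gap between this value and the reported 4.297 comes from R Square being rounded to three decimals.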