How to Plot a Regression Line by Group with ggplot2

by Tutor Aspire

We can use the following syntax to plot a regression line by group using the R visualization package ggplot2:

ggplot(df, aes(x = x_variable, y = y_variable, color = group_variable)) +
  geom_point() +
  geom_smooth(method = "lm", fill = NA)

This tutorial provides a quick example of how to use this syntax in practice.

Example: Plot Regression Lines by Group with ggplot2

Suppose we have a dataset that records the following three variables for 15 different students:

  • Number of hours studied
  • Exam score received
  • Study technique used (either A, B, or C)
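#create dataset
df <- data.frame(hours=c(1, 2, 3, 3, 4, 1, 2, 2, 3, 4, 1, 2, 3, 4, 4),
                 score=c(84, 86, 85, 87, 94, 74, 76, 75, 77, 79, 65, 67, 69, 72, 80),
                 technique=rep(c('A', 'B', 'C'), each=5))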

#view dataset
df

   hours score technique
1      1    84         A
2      2    86         A
3      3    85         A
4      3    87         A
5      4    94         A
6      1    74         B
7      2    76         B
8      2    75         B
9      3    77         B
10     4    79         B
11     1    65         C
12     2    67         C
13     3    69         C
14     4    72         C
15     4    80         C

The following code shows how to plot a regression line that captures the relationship between hours studied and exam score received for each of the three study techniques:

#load ggplot2
library(ggplot2)

#create regression lines for all three groups
ggplot(df, aes(x = hours, y = score, color = technique)) +
  geom_point() +
  geom_smooth(method = "lm", fill = NA)

Regression line by group in ggplot2
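
To see the numbers behind the plotted lines, one option is to fit a separate linear model for each technique in base R. This is only a sketch and not part of the original example: it re-creates df from the table printed above, and the names fits and coefs are illustrative. Since geom_smooth(method = "lm") fits y ~ x within each color group by default, these coefficients should correspond to the three lines in the plot.

```r
# Re-create the dataset from the table printed above
df <- data.frame(hours = c(1, 2, 3, 3, 4, 1, 2, 2, 3, 4, 1, 2, 3, 4, 4),
                 score = c(84, 86, 85, 87, 94, 74, 76, 75, 77, 79, 65, 67, 69, 72, 80),
                 technique = rep(c("A", "B", "C"), each = 5))

# Fit score ~ hours separately for each study technique
# (fits is an illustrative name, not from the tutorial)
fits <- lapply(split(df, df$technique), function(g) lm(score ~ hours, data = g))

# Collect the intercept and slope of each fit, one row per technique
coefs <- t(sapply(fits, coef))
coefs
```

The hours column holds the slope of each line, i.e. the estimated change in exam score per additional hour studied within that technique.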

Note that in geom_smooth() we used method = "lm" to specify a linear trend.

We could also use other smoothing methods such as "glm", "loess", or "gam"; "loess" and "gam" in particular can capture nonlinear trends in the data. The full list of options is in the geom_smooth() documentation.
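
As a hedged illustration of a nonlinear trend, the sketch below keeps method = "lm" but passes a quadratic formula, which tends to behave better than "loess" when each group has only five points. The quadratic formula and the use of se = FALSE (the documented argument for hiding the confidence band, similar in effect to fill = NA) are choices made for this sketch, not part of the original example; df is re-created from the table above.

```r
library(ggplot2)

# Re-create the dataset from the table printed above
df <- data.frame(hours = c(1, 2, 3, 3, 4, 1, 2, 2, 3, 4, 1, 2, 3, 4, 4),
                 score = c(84, 86, 85, 87, 94, 74, 76, 75, 77, 79, 65, 67, 69, 72, 80),
                 technique = rep(c("A", "B", "C"), each = 5))

# Fit a quadratic curve per technique instead of a straight line
# (assumption: a degree-2 polynomial is used here purely for illustration)
p <- ggplot(df, aes(x = hours, y = score, color = technique)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2), se = FALSE)

p
```

With so few observations per group, any curved fit should be read as a visual aid rather than evidence of a real nonlinear relationship.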

Note that we could also use different shapes to display the exam scores for each of the three groups:

ggplot(df, aes(x = hours, y = score, color = technique, shape = technique)) +
  geom_point() +
  geom_smooth(method = "lm", fill = NA)

Multiple regression lines in one plot in ggplot2
