
How to Join Multiple Data Frames Using dplyr

by Tutor Aspire

Often you may want to join more than two data frames in R. Because the left_join() function from the dplyr package joins two data frames at a time, you can do this by chaining several left_join() calls together.

library(dplyr)

For example, suppose we have the following three data frames:

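#create data frames (values inferred from the join output shown below)
df1 <- data.frame(a = c('a', 'b', 'c', 'd', 'e', 'f'),
                  b = c(12, 14, 14, 18, 22, 23))

df2 <- data.frame(a = c('a', 'a', 'a', 'b', 'b', 'b'),
                  c = c(23, 24, 33, 34, 37, 41))

df3 <- data.frame(a = c('d', 'e', 'f'),
                  d = c(23, 24, 33))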

To join all three data frames together, we can simply perform two left joins, one after the other:

#join the three data frames
df1 %>%
    left_join(df2, by='a') %>%
    left_join(df3, by='a')

   a  b  c  d
1  a 12 23 NA
2  a 12 24 NA
3  a 12 33 NA
4  b 14 34 NA
5  b 14 37 NA
6  b 14 41 NA
7  c 14 NA NA
8  d 18 NA 23
9  e 22 NA 24
10 f 23 NA 33
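When there are more than a few data frames, writing one left_join() per data frame gets repetitive. One possible alternative, sketched below, is to put the data frames in a list and let base R's Reduce() apply left_join() pairwise from left to right. The data frame values here are assumptions chosen to match the output shown above, and the names df_list and joined are only illustrative.

```r
library(dplyr)

# Assumed inputs, chosen to be consistent with the join output shown above
df1 <- data.frame(a = c('a', 'b', 'c', 'd', 'e', 'f'),
                  b = c(12, 14, 14, 18, 22, 23))
df2 <- data.frame(a = c('a', 'a', 'a', 'b', 'b', 'b'),
                  c = c(23, 24, 33, 34, 37, 41))
df3 <- data.frame(a = c('d', 'e', 'f'),
                  d = c(23, 24, 33))

# The first element is the data frame whose rows are kept by the left joins
df_list <- list(df1, df2, df3)

# Reduce() evaluates left_join(left_join(df1, df2, by='a'), df3, by='a')
joined <- Reduce(function(x, y) left_join(x, y, by = 'a'), df_list)

joined
```

This should give the same result as chaining the two left_join() calls by hand, as long as every data frame in the list shares the key column a.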

Note that you can also save the result of this join as a data frame:

#join the three data frames and save result as new data frame named all_data
all_data <- df1 %>%
              left_join(df2, by='a') %>%
              left_join(df3, by='a')

#view summary of resulting data frame
glimpse(all_data)

Observations: 10
Variables: 4
$ a <chr> "a", "a", "a", "b", "b", "b", "c", "d", "e", "f"
$ b <dbl> 12, 12, 12, 14, 14, 14, 14, 18, 22, 23
$ c <dbl> 23, 24, 33, 34, 37, 41, NA, NA, NA, NA
$ d <dbl> NA, NA, NA, NA, NA, NA, NA, 23, 24, 33
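The NA values in columns c and d appear wherever a key in df1 has no matching row in the data frame being joined. A quick way to check which keys are affected is dplyr's anti_join(), which returns the rows of the first data frame that have no match in the second. The sketch below assumes data frame values consistent with the output shown above; the names missing_c and missing_d are only illustrative.

```r
library(dplyr)

# Assumed inputs, chosen to be consistent with the join output shown above
df1 <- data.frame(a = c('a', 'b', 'c', 'd', 'e', 'f'),
                  b = c(12, 14, 14, 18, 22, 23))
df2 <- data.frame(a = c('a', 'a', 'a', 'b', 'b', 'b'),
                  c = c(23, 24, 33, 34, 37, 41))
df3 <- data.frame(a = c('d', 'e', 'f'),
                  d = c(23, 24, 33))

# Rows of df1 with no match in df2: these keys get NA in column c
missing_c <- anti_join(df1, df2, by = 'a')
missing_c$a   # keys c, d, e, f

# Rows of df1 with no match in df3: these keys get NA in column d
missing_d <- anti_join(df1, df3, by = 'a')
missing_d$a   # keys a, b, c
```

Because keys a and b each match three rows in df2, they appear three times in the joined result, which is why column d has seven NA values even though only three keys are unmatched.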

Additional Resources

How to Filter Rows in R
How to Remove Duplicate Rows in R
How to Group & Summarize Data in R
