You can use the following syntax to select specific columns in a data frame in base R:
#select columns by name
df[c('col1', 'col2', 'col4')]

#select columns by index
df[c(1, 2, 4)]
Alternatively, you can use the select() function from the dplyr package:
library(dplyr)

#select columns by name
df %>%
  select(col1, col2, col4)

#select columns by index
df %>%
  select(1, 2, 4)
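Both methods return the same columns. For plain column selection the speed difference between them is usually negligible, so the choice mostly comes down to style: select() fits naturally into a dplyr pipeline, while the base R syntax requires no additional packages.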
The following examples show how to use both of these methods in practice with the following data frame:
#create data frame
df <- data.frame(a=c(1, 3, 4, 6, 8, 9),
                 b=c(7, 8, 8, 7, 13, 16),
                 c=c(11, 13, 13, 18, 19, 22),
                 d=c(12, 16, 18, 22, 29, 38))

#view data frame
df

  a  b  c  d
1 1  7 11 12
2 3  8 13 16
3 4  8 13 18
4 6  7 18 22
5 8 13 19 29
6 9 16 22 38
Example 1: Select Specific Columns Using Base R (by name)
The following code shows how to select specific columns by name using base R:
#select columns by name
df[c('a', 'b', 'd')]

  a  b  d
1 1  7 12
2 3  8 16
3 4  8 18
4 6  7 22
5 8 13 29
6 9 16 38
Example 2: Select Specific Columns Using Base R (by index)
The following code shows how to select specific columns by index using base R:
#select columns by index
df[c(1, 2, 4)]
  a  b  d
1 1  7 12
2 3  8 16
3 4  8 18
4 6  7 22
5 8 13 29
6 9 16 38
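When only one column is selected in base R, the type of the result depends on the bracket style used. The following is a minimal sketch of that behavior, along with negative indexing to exclude a column. The object names (one_col_df, one_col_vec, kept, no_c) are illustrative and not part of the original examples, and the data frame is recreated so the block runs on its own:

```r
#recreate the example data frame so this block is self-contained
df <- data.frame(a=c(1, 3, 4, 6, 8, 9),
                 b=c(7, 8, 8, 7, 13, 16),
                 c=c(11, 13, 13, 18, 19, 22),
                 d=c(12, 16, 18, 22, 29, 38))

#single brackets with one name return a one-column data frame
one_col_df <- df['a']
class(one_col_df)

#double brackets extract the column itself as a vector
one_col_vec <- df[['a']]
class(one_col_vec)

#matrix-style indexing returns a vector unless drop=FALSE is given
kept <- df[, 'a', drop=FALSE]
class(kept)

#a negative index excludes a column by position (here, the third column)
no_c <- df[-3]
names(no_c)
```

Using single brackets, as in the examples above, is usually the safer choice when the code that follows expects a data frame.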
Example 3: Select Specific Columns Using dplyr (by name)
The following code shows how to select specific columns by name using dplyr:
library(dplyr)
#select columns by name
df %>%
  select(a, b, d)

  a  b  d
1 1  7 12
2 3  8 16
3 4  8 18
4 6  7 22
5 8 13 29
6 9 16 38
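The select() function also accepts a few other forms for naming columns. The following sketch shows three of them on the same data frame: a range of names, a negated name, and a character vector wrapped in all_of(). The variable names (by_range, without_c, cols, from_vector) are illustrative only:

```r
library(dplyr)

#recreate the example data frame so this block is self-contained
df <- data.frame(a=c(1, 3, 4, 6, 8, 9),
                 b=c(7, 8, 8, 7, 13, 16),
                 c=c(11, 13, 13, 18, 19, 22),
                 d=c(12, 16, 18, 22, 29, 38))

#select a range of adjacent columns (a through b) plus column d
by_range <- df %>%
  select(a:b, d)

#drop a column by negating its name
#(here c refers to the column named c, not the c() function)
without_c <- df %>%
  select(-c)

#select columns whose names are stored in a character vector
cols <- c('a', 'b', 'd')
from_vector <- df %>%
  select(all_of(cols))

#each approach keeps columns a, b and d
names(by_range)
names(without_c)
names(from_vector)
```

The all_of() form is handy when the column names are not known until the code runs, for example when they are passed into a function.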
Example 4: Select Specific Columns Using dplyr (by index)
The following code shows how to select specific columns by index using dplyr:
library(dplyr)
#select columns by index
df %>%
  select(1, 2, 4)

  a  b  d
1 1  7 12
2 3  8 16
3 4  8 18
4 6  7 22
5 8 13 29
6 9 16 38
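All four examples produce the same output. As a quick sanity check, the results can be compared programmatically. This is a minimal sketch that uses all.equal() from base R; the object names (base_name, base_index, dplyr_name, dplyr_index) are illustrative:

```r
library(dplyr)

#recreate the example data frame so this block is self-contained
df <- data.frame(a=c(1, 3, 4, 6, 8, 9),
                 b=c(7, 8, 8, 7, 13, 16),
                 c=c(11, 13, 13, 18, 19, 22),
                 d=c(12, 16, 18, 22, 29, 38))

#run each of the four selection methods
base_name   <- df[c('a', 'b', 'd')]
base_index  <- df[c(1, 2, 4)]
dplyr_name  <- df %>% select(a, b, d)
dplyr_index <- df %>% select(1, 2, 4)

#compare every result against the first one
same_as_base <- c(
  isTRUE(all.equal(base_name, base_index)),
  isTRUE(all.equal(base_name, dplyr_name)),
  isTRUE(all.equal(base_name, dplyr_index))
)

#TRUE only if all four methods agree
all(same_as_base)
```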
Additional Resources
How to Add a Column to a Data Frame in R
How to Loop Through Column Names in R
How to Sort a Data Frame by Column in R