
How to Extract Numbers from Strings in R (With Examples)

by Tutor Aspire

You can use the following methods to extract numbers from strings in R:

Method 1: Extract Number from String Using Base R

as.numeric(gsub("\\D", "", df$my_column))

Method 2: Extract Number from String Using readr Package

library(readr)

parse_number(df$my_column)

This tutorial explains how to use each method in practice with the following data frame:

#create data frame
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                 position=c('Guard23', 'Guard14', '2Forward',
                            'Guard25', '6Forward', 'Center99'))

#view data frame
df

  team position
1    A  Guard23
2    A  Guard14
3    A 2Forward
4    B  Guard25
5    B 6Forward
6    B Center99

Example 1: Extract Number from String Using Base R

The following code shows how to extract the numbers from each string in the position column of the data frame:

#extract number from each string in 'position' column
as.numeric(gsub("\\D", "", df$position))

[1] 23 14  2 25  6 99

Notice that the numeric values have been extracted from each string in the position column.

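Note: The gsub() function replaces every non-digit character ( \D ) in a string with an empty string, which deletes it. Only the digits remain, and as.numeric() then converts them to a number. In R code the backslash must itself be escaped, so the pattern is written as "\\D".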
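To make the note above more concrete, here is a minimal sketch that applies the same pattern to a few made-up strings. The vector x and the names digits and nums are illustrative assumptions and are not part of the tutorial's data frame. It shows the intermediate character result before as.numeric() is applied, along with a caveat: because every non-digit is deleted, separate groups of digits are joined together and decimal points are lost.

```r
#hypothetical example strings (not from the data frame above)
x <- c("Guard23", "2Forward", "A1B2", "Score 3.5", "Center")

#step 1: delete every non-digit character; the result is still a character vector
digits <- gsub("\\D", "", x)
digits
#[1] "23" "2"  "12" "35" ""

#step 2: convert to numeric; a string with no digits becomes NA
nums <- as.numeric(digits)
nums
#[1] 23  2 12 35 NA
```

If your strings may contain more than one number or a decimal value, this approach is probably not the right fit as written.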

If you’d like, you can also store these numeric values in a new column in the data frame:

#create new column that contains numbers from each string in 'position' column
df$num <- as.numeric(gsub("\\D", "", df$position))

#view updated data frame
df

  team position num
1    A  Guard23  23
2    A  Guard14  14
3    A 2Forward   2
4    B  Guard25  25
5    B 6Forward   6
6    B Center99  99

Example 2: Extract Number from String Using readr Package

The following code shows how to extract the numbers from each string in the position column of the data frame by using the parse_number() function from the readr package:

library(readr)

#extract number from each string in 'position' column
parse_number(df$position)

[1] 23 14  2 25  6 99

Notice that the numeric values have been extracted from each string in the position column.

This matches the results from using the gsub() function in base R.
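The two methods agree here because each string contains a single whole number. As a rough sketch of where they can differ, the following compares them on a few made-up strings. The vector x and the names base_result and readr_result are hypothetical, and the example assumes the readr package is installed. The readr documentation describes parse_number() as parsing the first number it finds, whereas the gsub() approach keeps every digit in the string.

```r
library(readr)

#hypothetical strings with several digit groups, a decimal, or a grouping mark
x <- c("A1B2", "Score 3.5", "$1,234")

#base R approach: deletes all non-digits, so digit groups are merged
base_result <- as.numeric(gsub("\\D", "", x))
base_result
#[1]   12   35 1234

#readr approach: parses the first number found in each string
readr_result <- parse_number(x)
readr_result
#[1]    1.0    3.5 1234.0
```

Which behavior you want depends on your data, so it is worth checking a few representative strings before choosing a method.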

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Select Columns Containing a Specific String in R
How to Remove Characters from String in R
How to Find Location of Character in a String in R
