
The **droplevels()** function in R can be used to drop unused factor levels.

This function is particularly useful if we want to drop factor levels that are no longer used due to subsetting a vector or a data frame.

This function uses the following syntax:

**droplevels(x)**

where *x* is an object from which to drop unused factor levels.

This tutorial provides a couple of examples of how to use this function in practice.

**Example 1: Drop Unused Factor Levels in a Vector**

Suppose we create a vector of data with five factor levels. Then suppose we define a new vector of data with just three of the original five factor levels.

    #define data with 5 factor levels
    data <- factor(c(1, 2, 3, 4, 5))

    #define new data as original data minus 4th and 5th factor levels
    new_data <- data[-c(4, 5)]

    #view new data
    new_data

    [1] 1 2 3
    Levels: 1 2 3 4 5

Although the new data contains only three values, we can see that it still carries all five of the original factor levels.

To remove these unused factor levels, we can use the **droplevels()** function:

    #drop unused factor levels
    new_data <- droplevels(new_data)

    #view data
    new_data

    [1] 1 2 3
    Levels: 1 2 3

The new data now contains just three factor levels.
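
As a side note, base R offers two other ways to reach the same result: calling **factor()** again on the subset, or passing `drop = TRUE` when subsetting. The following is a minimal sketch; the names `via_factor` and `via_drop` are illustrative only.

```r
#define data with 5 factor levels
data <- factor(c(1, 2, 3, 4, 5))

#option 1: rebuild the factor from the subset
via_factor <- factor(data[-c(4, 5)])

#option 2: drop unused levels while subsetting
via_drop <- data[-c(4, 5), drop = TRUE]

#both keep only the levels that are still in use
levels(via_factor)
#[1] "1" "2" "3"
levels(via_drop)
#[1] "1" "2" "3"
```

Either option works for a single factor. **droplevels()** is often more convenient when the factor has already been subset elsewhere.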

**Example 2: Drop Unused Factor Levels in a Data Frame**

Suppose we create a data frame in which one of the variables is a factor with five levels. Then suppose we define a new data frame that happens to remove two of these factor levels:

    #create data frame
    df <- data.frame(region=factor(c('A', 'B', 'C', 'D', 'E')),
                     sales = c(13, 16, 22, 27, 34))

    #view data frame
    df

      region sales
    1      A    13
    2      B    16
    3      C    22
    4      D    27
    5      E    34

    #define new data frame
    #(the original condition was cut off; sales < 25 is an assumed cutoff
    # that reproduces the output shown below)
    new_df <- subset(df, sales < 25)

    #view new data frame
    new_df

      region sales
    1      A    13
    2      B    16
    3      C    22

    #check levels of region variable
    levels(new_df$region)

    [1] "A" "B" "C" "D" "E"

Although the new data frame contains only three distinct values in the *region* column, it still carries all five of the original factor levels. This can cause problems downstream: unused levels still appear as empty categories in summary tables and in plots built from them.
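
To see the kind of issue this can cause, the minimal sketch below rebuilds the example data and tabulates the *region* column. The cutoff of `sales < 25` is an assumption; any cutoff that keeps only the first three rows behaves the same way.

```r
#recreate the example data
df <- data.frame(region = factor(c('A', 'B', 'C', 'D', 'E')),
                 sales = c(13, 16, 22, 27, 34))

#keep the three lowest-sales rows (cutoff of 25 is assumed)
new_df <- subset(df, sales < 25)

#count rows per region
counts <- table(new_df$region)

#unused levels D and E are still listed...
names(counts)
#[1] "A" "B" "C" "D" "E"

#...with a count of zero
as.vector(counts)
#[1] 1 1 1 0 0
```

A bar chart drawn from these counts, for example with `barplot(counts)`, would include empty bars for D and E.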

To remove the unused factor levels from the *region* variable, we can use the **droplevels()** function:

    #drop unused factor levels
    new_df$region <- droplevels(new_df$region)

    #check levels of region variable
    levels(new_df$region)

    [1] "A" "B" "C"

Now the *region* variable only contains three factor levels.
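
**droplevels()** also accepts a whole data frame, in which case it drops unused levels from every factor column at once. The sketch below illustrates this with a hypothetical second factor column, `team`, added purely for the example. The `except` argument of the data frame method gives the indices of columns to leave untouched. The `sales < 25` cutoff is likewise an assumption.

```r
#create data frame with two factor columns (team is a made-up column)
df <- data.frame(region = factor(c('A', 'B', 'C', 'D', 'E')),
                 team = factor(c('x', 'x', 'y', 'y', 'z')),
                 sales = c(13, 16, 22, 27, 34))

#keep rows with low sales (assumed cutoff)
new_df <- subset(df, sales < 25)

#drop unused levels from every factor column
all_dropped <- droplevels(new_df)
levels(all_dropped$region)
#[1] "A" "B" "C"
levels(all_dropped$team)
#[1] "x" "y"

#drop unused levels everywhere except column 2 (team)
region_only <- droplevels(new_df, except = 2)
levels(region_only$region)
#[1] "A" "B" "C"
levels(region_only$team)
#[1] "x" "y" "z"
```

Passing the whole data frame saves repeating the call for each factor column, while `except` preserves levels that should stay, such as categories that are expected to reappear in later data.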
