You can use one of the following methods to select the top N values by group in R:
Method 1: Select Top N Values by Group (Ignore Ties)
library(dplyr)

#select top 5 values by group
df %>%
  arrange(desc(values_column)) %>%
  group_by(group_column) %>%
  slice(1:5)
Method 2: Select Top N Values by Group (Include Ties)
library(dplyr)

#select top 5 values by group
df %>%
  group_by(group_column) %>%
  top_n(5, values_column)
The following examples show how to use each method with the following data frame in R:
#create data frame
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(19, 22, 15, NA, 14, 25, 25, 25),
                 rebounds=c(10, 6, 3, 7, 11, 13, 9, 12))

#view data frame
df
  team points rebounds
1    A     19       10
2    A     22        6
3    A     15        3
4    A     NA        7
5    B     14       11
6    B     25       13
7    B     25        9
8    B     25       12
Example 1: Select Top N Values by Group (Ignore Ties)
The following code shows how to select the top 2 rows with the highest points values, grouped by team:
library(dplyr)

#select top 2 rows with highest points values, grouped by team
df %>%
  arrange(desc(points)) %>%
  group_by(team) %>%
  slice(1:2)

# A tibble: 4 x 3
# Groups:   team [2]
  team  points rebounds
1 A         22        6
2 A         19       10
3 B         25       13
4 B         25        9
The output contains the two rows with the highest points values for each team.
Note that team B has three rows tied for the highest points value (25), but only two of them are returned in the output.
This method simply ignores ties: slice(1:2) keeps whichever two rows come first in each group after sorting.
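As an alternative sketch, the sort-then-slice steps can be collapsed into a single call to slice_max(), assuming dplyr 1.0.0 or later is installed. Setting with_ties = FALSE caps each group at two rows. The name top2_no_ties is just an illustrative choice and is not part of the original example.

```r
library(dplyr)

#recreate the example data frame
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(19, 22, 15, NA, 14, 25, 25, 25),
                 rebounds=c(10, 6, 3, 7, 11, 13, 9, 12))

#select top 2 rows by points within each team, dropping extra tied rows
#(assumes dplyr >= 1.0.0, where slice_max() is available)
top2_no_ties <- df %>%
  group_by(team) %>%
  slice_max(points, n = 2, with_ties = FALSE)

#view result (at most 2 rows per team)
top2_no_ties
```

Because the ordering and the row limit are handled in one step, there is no need to call arrange() before grouping.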
Example 2: Select Top N Values by Group (Include Ties)
The following code shows how to select the top 2 rows with the highest points values, grouped by team, while keeping every row that ties with them:
library(dplyr)

#select top 2 rows with highest points values, grouped by team
df %>%
  group_by(team) %>%
  top_n(2, points)

# A tibble: 5 x 3
# Groups:   team [2]
  team  points rebounds
1 A         19       10
2 A         22        6
3 B         25       13
4 B         25        9
5 B         25       12
The output contains the top two rows by points for each team, plus any rows tied with them, so a team can contribute more than two rows.
Note that for team B, there were three rows that were tied for highest points value (25) so this method included all three of those rows in the final output.
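The dplyr documentation marks top_n() as superseded in favor of slice_max(), so the same tie-including selection can also be sketched as follows, assuming dplyr 1.0.0 or later. By default slice_max() uses with_ties = TRUE. The name top2_with_ties is an illustrative choice and is not part of the original example.

```r
library(dplyr)

#recreate the example data frame
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(19, 22, 15, NA, 14, 25, 25, 25),
                 rebounds=c(10, 6, 3, 7, 11, 13, 9, 12))

#select top 2 rows by points within each team, keeping tied rows
#(with_ties = TRUE is the default, written out here for clarity)
top2_with_ties <- df %>%
  group_by(team) %>%
  slice_max(points, n = 2, with_ties = TRUE)

#count how many rows were returned for each team
top2_with_ties %>%
  summarise(rows = n())
```

Unlike top_n(), which keeps rows in their original order, slice_max() returns the rows within each group sorted from highest to lowest.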