One of the simplest ways to filter the rows of a pandas DataFrame by column values is to use the query() method, which takes the filter condition as a string.
This tutorial provides several examples of how to use this method in practice with the following pandas DataFrame:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'C'],
                   'points': [25, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 9, 12],
                   'rebounds': [11, 8, 10, 6, 6]})

#view DataFrame
df

  team  points  assists  rebounds
0    A      25        5        11
1    A      12        7         8
2    B      15        7        10
3    B      14        9         6
4    C      19       12         6
Example 1: Filter Based on One Column
The following code shows how to filter the rows of the DataFrame based on a single value in the “points” column:
#return rows where points is equal to 15
df.query('points == 15')

  team  points  assists  rebounds
2    B      15        7        10
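For comparison, the same filter can be written with a boolean mask. The sketch below rebuilds the DataFrame from this tutorial and checks that both forms select the same row. It also filters on the string column "team", where the value has to be quoted inside the expression. The variable names `by_query`, `by_mask`, and `team_a` are illustrative only.

```python
import pandas as pd

# same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'C'],
                   'points': [25, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 9, 12],
                   'rebounds': [11, 8, 10, 6, 6]})

# query() takes the condition as a string
by_query = df.query('points == 15')

# equivalent boolean-mask form
by_mask = df[df['points'] == 15]

# string values must be quoted inside the query expression
team_a = df.query('team == "A"')

print(by_query.equals(by_mask))
print(team_a)
```

Both forms return a new DataFrame and leave df unchanged, so the choice between them is mostly a matter of readability.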
Example 2: Filter Based on Multiple Columns
The following code shows how to filter the rows of the DataFrame based on multiple conditions, combined with | (or) and & (and):
#return rows where points is equal to 15 or 14
df.query('points == 15 | points == 14')

  team  points  assists  rebounds
2    B      15        7        10
3    B      14        9         6

#return rows where points is greater than 13 and rebounds is greater than 6
df.query('points > 13 & rebounds > 6')

  team  points  assists  rebounds
0    A      25        5        11
2    B      15        7        10
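Query expressions also accept the keywords `or` and `and` in place of `|` and `&`, and parentheses can make the grouping explicit when the two are mixed. The following is a minimal sketch on the tutorial's DataFrame. The third filter, which adds team "C", is a made-up condition for illustration.

```python
import pandas as pd

# same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'C'],
                   'points': [25, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 9, 12],
                   'rebounds': [11, 8, 10, 6, 6]})

# `or` / `and` can be used instead of `|` / `&`
either = df.query('points == 15 or points == 14')
both = df.query('points > 13 and rebounds > 6')

# parentheses make the grouping explicit when mixing operators
# (hypothetical condition: strong rows, or anything from team C)
mixed = df.query('(points > 13 and rebounds > 6) or team == "C"')

print(either)
print(both)
print(mixed)
```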
Example 3: Filter Based on Values in a List
The following code shows how to filter the rows of the DataFrame based on values in a list. The @ prefix tells query to look up value_list as a Python variable rather than a column name:
#define list of values
value_list = [12, 19, 25]

#return rows where points is in the list of values
df.query('points in @value_list')

  team  points  assists  rebounds
0    A      25        5        11
1    A      12        7         8
4    C      19       12         6

#return rows where points is not in the list of values
df.query('points not in @value_list')

  team  points  assists  rebounds
2    B      15        7        10
3    B      14        9         6
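The @ prefix works for any local variable, not only lists. As a rough sketch, the code below compares the list filter with the equivalent isin() mask and then uses a scalar threshold. The name `min_points` and its value of 15 are assumptions chosen for illustration.

```python
import pandas as pd

# same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'C'],
                   'points': [25, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 9, 12],
                   'rebounds': [11, 8, 10, 6, 6]})

value_list = [12, 19, 25]

# list membership with query() and with the equivalent isin() mask
in_list = df.query('points in @value_list')
in_list_mask = df[df['points'].isin(value_list)]

# @ also references scalar variables (hypothetical threshold)
min_points = 15
at_least = df.query('points >= @min_points')

print(in_list.equals(in_list_mask))
print(at_least)
```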
Additional Resources
How to Replace Values in Pandas
How to Drop Rows with NaN Values in Pandas
How to Drop Duplicate Rows in Pandas