You can use the following methods to replicate the SQL LIKE operator inside the pandas query() function and find rows that contain a particular pattern:
Method 1: Find Rows that Contain One Pattern
df.query('my_column.str.contains("pattern1")')
Method 2: Find Rows that Contain One of Several Patterns
df.query('my_column.str.contains("pattern1|pattern2")')
The following examples show how to use each method in practice with the following pandas DataFrame:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['Cavs', 'Heat', 'Mavs', 'Mavs', 'Nets',
                            'Heat', 'Cavs', 'Jazz', 'Jazz', 'Hawks'],
                   'points': [3, 3, 4, 5, 4, 7, 8, 7, 12, 14],
                   'rebounds': [15, 14, 14, 10, 8, 14, 13, 9, 5, 4]})

#view DataFrame
print(df)

    team  points  rebounds
0   Cavs       3        15
1   Heat       3        14
2   Mavs       4        14
3   Mavs       5        10
4   Nets       4         8
5   Heat       7        14
6   Cavs       8        13
7   Jazz       7         9
8   Jazz      12         5
9  Hawks      14         4
Example 1: Find Rows that Contain One Pattern
The following code shows how to use the query() function to find all rows in the DataFrame that contain “avs” in the team column:
df.query('team.str.contains("avs")')

   team  points  rebounds
0  Cavs       3        15
2  Mavs       4        14
3  Mavs       5        10
6  Cavs       8        13
Each row that is returned contains “avs” somewhere in the team column.
Also note that this syntax is case-sensitive.
Thus, if we used “AVS” instead then we would not receive any results because no row contains uppercase “AVS” in the team column.
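If a case-insensitive match is needed, one option is the case argument of str.contains(). The following is a minimal sketch, not part of the original example: it rebuilds the same DataFrame so that it runs on its own, and it passes engine='python' to query() as a precaution, on the assumption that string methods may not evaluate under the numexpr engine in every pandas version.

```python
import pandas as pd

# Same data as the DataFrame used throughout this tutorial
df = pd.DataFrame({'team': ['Cavs', 'Heat', 'Mavs', 'Mavs', 'Nets',
                            'Heat', 'Cavs', 'Jazz', 'Jazz', 'Hawks'],
                   'points': [3, 3, 4, 5, 4, 7, 8, 7, 12, 14],
                   'rebounds': [15, 14, 14, 10, 8, 14, 13, 9, 5, 4]})

# case=False makes the match ignore letter case, so "AVS" matches "avs"
result = df.query('team.str.contains("AVS", case=False)', engine='python')

print(result)
```

With case=False, the uppercase pattern should return the same Cavs and Mavs rows as the lowercase "avs" search.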
Example 2: Find Rows that Contain One of Several Patterns
The following code shows how to use the query() function to find all rows in the DataFrame that contain “avs” or “eat” in the team column:
df.query('team.str.contains("avs|eat")')

   team  points  rebounds
0  Cavs       3        15
1  Heat       3        14
2  Mavs       4        14
3  Mavs       5        10
5  Heat       7        14
6  Cavs       8        13
Each row that is returned contains either “avs” or “eat” somewhere in the team column.
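Both examples match a pattern anywhere in the string, which corresponds to LIKE '%avs%' in SQL. For anchored patterns such as LIKE 'Ma%' or LIKE '%zz', one possible approach is to use str.startswith() and str.endswith() inside query(). The sketch below is an illustration rather than part of the original examples; the search strings "Ma" and "zz" are chosen only for demonstration, and engine='python' is passed as a precaution.

```python
import pandas as pd

# Same data as the DataFrame used throughout this tutorial
df = pd.DataFrame({'team': ['Cavs', 'Heat', 'Mavs', 'Mavs', 'Nets',
                            'Heat', 'Cavs', 'Jazz', 'Jazz', 'Hawks'],
                   'points': [3, 3, 4, 5, 4, 7, 8, 7, 12, 14],
                   'rebounds': [15, 14, 14, 10, 8, 14, 13, 9, 5, 4]})

# Roughly LIKE 'Ma%': team names that begin with "Ma"
starts = df.query('team.str.startswith("Ma")', engine='python')

# Roughly LIKE '%zz': team names that end with "zz"
ends = df.query('team.str.endswith("zz")', engine='python')

print(starts)
print(ends)
```

Here starts should hold the two Mavs rows and ends the two Jazz rows.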
Note: The | operator stands for “or” in a regular expression, and str.contains() treats its pattern as a regular expression by default. Feel free to use as many of these operators as you’d like to search for even more string patterns.
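Because the pattern is interpreted as a regular expression, characters such as ".", "+" or "(" carry special meaning. When the search terms come from a list, one way to handle this is to escape each term with re.escape(), join the terms with |, and reference the result inside query() using the @ prefix. This is a sketch under those assumptions; the names terms and pattern are hypothetical and used only for illustration.

```python
import re

import pandas as pd

# Same data as the DataFrame used throughout this tutorial
df = pd.DataFrame({'team': ['Cavs', 'Heat', 'Mavs', 'Mavs', 'Nets',
                            'Heat', 'Cavs', 'Jazz', 'Jazz', 'Hawks'],
                   'points': [3, 3, 4, 5, 4, 7, 8, 7, 12, 14],
                   'rebounds': [15, 14, 14, 10, 8, 14, 13, 9, 5, 4]})

# Hypothetical list of literal search terms
terms = ['avs', 'eat']

# Escape each term so it is matched literally, then join with "|" (regex "or")
pattern = '|'.join(re.escape(term) for term in terms)

# @pattern refers to the Python variable defined above
result = df.query('team.str.contains(@pattern)', engine='python')

print(result)
```

For these two terms the escaping changes nothing, so the result should match the "avs|eat" example; the escaping only matters for terms that contain regular expression metacharacters.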
Additional Resources
The following tutorials explain how to perform other common tasks in pandas:
Pandas: How to Filter Rows Based on String Length
Pandas: How to Drop Rows Based on Condition
Pandas: How to Use “NOT IN” Filter