The where() function can be used to replace the values in a pandas DataFrame that do not meet a given condition.
This function uses the following basic syntax:
df.where(cond, other=nan)
For every value in a pandas DataFrame where cond is True, the original value is retained.
For every value where cond is False, the original value is replaced by the value specified by the other argument.
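As a quick illustration of the two rules above, here is a minimal sketch on a small toy Series. The Series `s` and the variable names are made up for this illustration and are not part of the tutorial's DataFrame.

```python
import pandas as pd

# hypothetical toy data, used only to illustrate the rule
s = pd.Series([1, 5, 10])

# cond is True for 5 and 10, so they are kept; 1 is replaced by other
replaced = s.where(s > 3, other=0)
print(replaced.tolist())  # [0, 5, 10]

# with no other argument, values that fail the condition become NaN
default = s.where(s > 3)
print(default.isna().tolist())  # [True, False, False]
```

Note that where() returns a new object and leaves the original `s` unchanged.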
The following examples show how to use this syntax in practice with the following pandas DataFrame:
import pandas as pd

#define DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#view DataFrame
df

   points  assists  rebounds
0      25        5        11
1      12        7         8
2      15        7        10
3      14        9         6
4      19       12         6
5      23        9         5
6      25        9         9
7      29        4        12
Example 1: Replace Values in Entire DataFrame
The following code shows how to use the where() function to replace all values that don’t meet a certain condition in an entire pandas DataFrame with a NaN value.
#keep values that are greater than 7, but replace all others with NaN
df.where(df>7)

   points  assists  rebounds
0      25      NaN      11.0
1      12      NaN       8.0
2      15      NaN      10.0
3      14      9.0       NaN
4      19     12.0       NaN
5      23      9.0       NaN
6      25      9.0       9.0
7      29      NaN      12.0
We can also use the other argument to replace values with something other than NaN.
#keep values that are greater than 7, but replace all others with 'low'
df.where(df>7, other='low')

  points assists rebounds
0     25     low       11
1     12     low        8
2     15     low       10
3     14       9      low
4     19      12      low
5     23       9      low
6     25       9        9
7     29     low       12
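One practical consequence of the choice of other is worth seeing side by side. The sketch below rebuilds the same DataFrame and compares a numeric fill value with the string 'low'. The variable names `filled` and `labeled` are chosen only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

# numeric fill value: every column remains numeric
filled = df.where(df > 7, other=0)

# string fill value: affected columns now hold a mix of numbers and strings
labeled = df.where(df > 7, other='low')

# compare the resulting column types
print(filled.dtypes)
print(labeled.dtypes)
```

If you plan to run further calculations on the result, a numeric fill value (or the default NaN) is usually easier to work with than a string label.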
Example 2: Replace Values in Specific Column of DataFrame
The following code shows how to use the where() function to replace all values that don’t meet a certain condition in a specific column of a DataFrame.
#keep values greater than 15 in 'points' column, but replace others with 'low'
df['points'] = df['points'].where(df['points']>15, other='low')

#view DataFrame
df

  points  assists  rebounds
0     25        5        11
1    low        7         8
2    low        7        10
3    low        9         6
4     19       12         6
5     23        9         5
6     25        9         9
7     29        4        12
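The condition does not have to come from the column being modified. The following is a minimal sketch, assuming a fresh copy of the original DataFrame, that keeps 'points' only in rows where 'assists' is at least 9 and stores the result in a new column. The column name 'points_high_assists' is hypothetical and chosen for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

# keep 'points' only where 'assists' is at least 9; other rows become NaN
result = df['points'].where(df['assists'] >= 9)

# store the result in a new (hypothetical) column so 'points' is not overwritten
df['points_high_assists'] = result

print(df)
```

Writing to a new column keeps the original 'points' values available for later comparisons.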
You can find the complete reference for the where() function in the official pandas documentation.
Additional Resources
The following tutorials explain how to use other common functions in pandas:
How to Use describe() Function in Pandas
How to Use idxmax() Function in Pandas
How to Apply a Function to Selected Columns in Pandas