You can use the fillna() function to replace NaN values in a pandas DataFrame.
Here are three common ways to use this function:
Method 1: Fill NaN Values in One Column with Mean
df['col1'] = df['col1'].fillna(df['col1'].mean())
Method 2: Fill NaN Values in Multiple Columns with Mean
df[['col1', 'col2']] = df[['col1', 'col2']].fillna(df[['col1', 'col2']].mean())
Method 3: Fill NaN Values in All Columns with Mean
df = df.fillna(df.mean())
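Method 3 assumes every column is numeric. If the DataFrame also contains text columns, df.mean() may raise an error or emit a warning, depending on the pandas version. One way to handle that is to pass numeric_only=True. The sketch below uses a hypothetical DataFrame with a team column that is not part of this tutorial's examples:

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame: one text column and two numeric columns
df_mixed = pd.DataFrame({'team': ['A', 'B', 'C', 'D'],
                         'points': [10.0, np.nan, 14.0, 12.0],
                         'assists': [4.0, 6.0, np.nan, 5.0]})

# Restrict the mean to numeric columns so the text column is skipped
col_means = df_mixed.mean(numeric_only=True)

# fillna() matches the Series of means to the columns by label,
# so only 'points' and 'assists' are filled
df_filled = df_mixed.fillna(col_means)
print(df_filled)
```

Because the means are matched to columns by name, the team column is left untouched.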
The examples below show how to use each method in practice with this pandas DataFrame:
import numpy as np
import pandas as pd

#create DataFrame with some NaN values
df = pd.DataFrame({'rating': [np.nan, 85, np.nan, 88, 94, 90, 76, 75, 87, 86],
                   'points': [25, np.nan, 14, 16, 27, 20, 12, 15, 14, 19],
                   'assists': [5, 7, 7, np.nan, 5, 7, 6, 9, 9, 5],
                   'rebounds': [11, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

#view DataFrame
df

   rating  points  assists  rebounds
0     NaN    25.0      5.0        11
1    85.0     NaN      7.0         8
2     NaN    14.0      7.0        10
3    88.0    16.0      NaN         6
4    94.0    27.0      5.0         6
5    90.0    20.0      7.0         9
6    76.0    12.0      6.0         6
7    75.0    15.0      9.0        10
8    87.0    14.0      9.0        10
9    86.0    19.0      5.0         7
Example 1: Fill NaN Values in One Column with Mean
The following code shows how to fill the NaN values in the rating column with the mean value of the rating column:
#fill NaNs with column mean in 'rating' column
df['rating'] = df['rating'].fillna(df['rating'].mean())

#view updated DataFrame
df

   rating  points  assists  rebounds
0  85.125    25.0      5.0        11
1  85.000     NaN      7.0         8
2  85.125    14.0      7.0        10
3  88.000    16.0      NaN         6
4  94.000    27.0      5.0         6
5  90.000    20.0      7.0         9
6  76.000    12.0      6.0         6
7  75.000    15.0      9.0        10
8  87.000    14.0      9.0        10
9  86.000    19.0      5.0         7
The mean of the non-missing values in the rating column was 85.125, so each NaN value in the rating column was filled with this value.
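The fill value can be checked by hand. The sketch below rebuilds the rating column as a standalone Series (the variable names are illustrative) and shows that mean() skips NaN values by default, dividing the sum by the 8 known values rather than by all 10 rows:

```python
import numpy as np
import pandas as pd

# Same values as the 'rating' column in the example DataFrame
rating = pd.Series([np.nan, 85, np.nan, 88, 94, 90, 76, 75, 87, 86])

# sum() and count() both ignore NaN, so this is 681 / 8
manual_mean = rating.sum() / rating.count()

print(rating.count())   # 8
print(manual_mean)      # 85.125
print(rating.mean())    # 85.125
```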
Example 2: Fill NaN Values in Multiple Columns with Mean
The following code shows how to fill the NaN values in both the rating and points columns with their respective column means:
#fill NaNs with column means in 'rating' and 'points' columns
df[['rating', 'points']] = df[['rating', 'points']].fillna(df[['rating', 'points']].mean())

#view updated DataFrame
df

   rating  points  assists  rebounds
0  85.125    25.0      5.0        11
1  85.000    18.0      7.0         8
2  85.125    14.0      7.0        10
3  88.000    16.0      NaN         6
4  94.000    27.0      5.0         6
5  90.000    20.0      7.0         9
6  76.000    12.0      6.0         6
7  75.000    15.0      9.0        10
8  87.000    14.0      9.0        10
9  86.000    19.0      5.0         7
The NaN values in both the rating and points columns were filled with their respective column means, while the NaN value in the assists column was left unchanged.
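One way to confirm which columns were filled is to count the remaining NaN values per column with isna().sum(). The sketch below rebuilds the example DataFrame so that it runs on its own. The cols and remaining names are illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'rating': [np.nan, 85, np.nan, 88, 94, 90, 76, 75, 87, 86],
                   'points': [25, np.nan, 14, 16, 27, 20, 12, 15, 14, 19],
                   'assists': [5, 7, 7, np.nan, 5, 7, 6, 9, 9, 5],
                   'rebounds': [11, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

# Fill only the selected columns with their own means
cols = ['rating', 'points']
df[cols] = df[cols].fillna(df[cols].mean())

# Count the NaN values left in each column
remaining = df.isna().sum()
print(remaining)
```

Only the assists column should still report one missing value, since it was not included in cols.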
Example 3: Fill NaN Values in All Columns with Mean
The following code shows how to fill the NaN values in every column with that column's mean:
#fill NaNs with column means in each column
df = df.fillna(df.mean())

#view updated DataFrame
df

   rating  points   assists  rebounds
0  85.125    25.0  5.000000        11
1  85.000    18.0  7.000000         8
2  85.125    14.0  7.000000        10
3  88.000    16.0  6.666667         6
4  94.000    27.0  5.000000         6
5  90.000    20.0  7.000000         9
6  76.000    12.0  6.000000         6
7  75.000    15.0  9.000000        10
8  87.000    14.0  9.000000        10
9  86.000    19.0  5.000000         7
Notice that the NaN values in each column were filled with their column mean.
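This works because df.mean() returns a Series indexed by column name, and fillna() matches each mean to its column by label. As a rough illustration, the sketch below compares the one-line approach with an explicit loop that applies Method 1 to every column. The filled_all and filled_loop names are illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'rating': [np.nan, 85, np.nan, 88, 94, 90, 76, 75, 87, 86],
                   'points': [25, np.nan, 14, 16, 27, 20, 12, 15, 14, 19],
                   'assists': [5, 7, 7, np.nan, 5, 7, 6, 9, 9, 5],
                   'rebounds': [11, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

# One-line version: the means are aligned to columns by label
filled_all = df.fillna(df.mean())

# Equivalent loop: fill each column with its own mean
filled_loop = df.copy()
for col in filled_loop.columns:
    filled_loop[col] = filled_loop[col].fillna(filled_loop[col].mean())

# Raises an AssertionError if the two results differ in values
pd.testing.assert_frame_equal(filled_all, filled_loop, check_dtype=False)
print(filled_all.isna().sum().sum())   # 0
```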
The complete documentation for the fillna() function is available in the official pandas documentation.