To winsorize data means to set extreme outliers equal to a specified percentile of the data.
For example, a 90% winsorization sets all observations greater than the 95th percentile equal to the value at the 95th percentile and all observations less than the 5th percentile equal to the value at the 5th percentile.
This tutorial provides a step-by-step example of how to winsorize a dataset in Excel.
Step 1: Create the Data
First, we’ll create the following dataset:
Step 2: Calculate the Upper and Lower Percentiles
For this example, we’ll perform a 90% winsorization. This means we’ll set all values greater than the 95th percentile equal to the 95th percentile and all values less than the 5th percentile equal to the 5th percentile.
The following formulas show how to find the 5th and 95th percentiles:
The 5th percentile turns out to be 12.35 and the 95th percentile turns out to be 92.05.
Step 3: Winsorize the Data
Lastly, we’ll use the following formula to winsorize the data:
Note that we just copy and pasted the formula in cell F2 down to the remaining cells in column F.
In this case, the value 3 became changed to 12.35 and the value 98 became changed to 92.05.
Note that in this example we performed a 90% winsorization, but it’s possible to also perform an 80% winsorization, 95% winsorization, 99% winsorization, etc. by simply calculating different upper and lower percentiles.