In statistics, deciles are the nine values that split a sorted dataset into ten groups of equal frequency.
The first decile is the point where 10% of all data values lie below it. The second decile is the point where 20% of all data values lie below it, and so on.
We can use the following syntax to calculate the deciles for a dataset in R:
quantile(data, probs = seq(.1, .9, by = .1))
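If the data may contain missing values, the same call can be wrapped in a small helper. The sketch below is only an illustration: `get_deciles` is a hypothetical name (not a base R function) and the values in `x` are made up. It relies on the `na.rm` argument of `quantile()`, which drops `NA` values before the deciles are computed.

```r
# Hypothetical helper (not part of base R): returns the nine deciles of x
get_deciles <- function(x, na.rm = FALSE) {
  quantile(x, probs = seq(.1, .9, by = .1), na.rm = na.rm)
}

# Made-up values with one missing entry, for illustration only
x <- c(12, 15, NA, 22, 27, 31, 38, 40, 44, 51, 59)

# na.rm = TRUE removes the NA before the deciles are calculated
d <- get_deciles(x, na.rm = TRUE)
d
```

Without `na.rm = TRUE`, `quantile()` stops with an error when the input contains `NA` values.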
The following example shows how to use this function in practice.
Example: Calculate Deciles in R
The following code shows how to create a fake dataset with 20 values and then calculate the values for the deciles of the dataset:
#create dataset
data <- c(56, 58, 64, 67, 68, 73, 78, 83, 84, 88,
          89, 90, 91, 92, 93, 93, 94, 95, 97, 99)

#calculate deciles of dataset
quantile(data, probs = seq(.1, .9, by = .1))

 10%  20%  30%  40%  50%  60%  70%  80%  90% 
63.4 67.8 76.5 83.6 88.5 90.4 92.3 93.2 95.2 
The way to interpret the deciles is as follows:
- 10% of all data values lie below 63.4.
- 20% of all data values lie below 67.8.
- 30% of all data values lie below 76.5.
- 40% of all data values lie below 83.6.
- 50% of all data values lie below 88.5.
- 60% of all data values lie below 90.4.
- 70% of all data values lie below 92.3.
- 80% of all data values lie below 93.2.
- 90% of all data values lie below 95.2.
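This interpretation can be checked directly. The sketch below assumes the dataset consists of the 20 values used in this article and computes, for each decile, the share of values that lie strictly below it. The names `deciles` and `share_below` are chosen here for illustration.

```r
# The 20 values used in this article
data <- c(56, 58, 64, 67, 68, 73, 78, 83, 84, 88,
          89, 90, 91, 92, 93, 93, 94, 95, 97, 99)

# The nine decile cut points
deciles <- quantile(data, probs = seq(.1, .9, by = .1))

# For each cut point, the share of values strictly below it
share_below <- sapply(deciles, function(d) mean(data < d))
share_below
```

For this dataset the shares come out to 0.1, 0.2, and so on up to 0.9. With heavily tied values or a sample size that is not a multiple of ten, the shares would only be approximately equal to these proportions.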
It’s worth noting that the fifth decile (the 50th percentile, 88.5 in this example) is equal to the median of the dataset.
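A quick way to confirm this is to compare the two values. This is a minimal sketch that assumes the same 20 values used in this article. `fifth_decile` is an illustrative variable name.

```r
# The 20 values used in this article
data <- c(56, 58, 64, 67, 68, 73, 78, 83, 84, 88,
          89, 90, 91, 92, 93, 93, 94, 95, 97, 99)

# Fifth decile, i.e. the 50th percentile
fifth_decile <- quantile(data, probs = 0.5)

# Compare with the median; unname() drops the "50%" label
isTRUE(all.equal(unname(fifth_decile), median(data)))
```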
Example: Place Values into Deciles in R
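To place each data value into a decile, we can use the ntile(x, n) function from the dplyr package in R, where n is the number of groups. This function ranks the values and splits them into n groups of equal (or nearly equal) size.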
Here’s how to use this function for the dataset we created in the previous example:
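library(dplyr)

#create dataset
data <- data.frame(values = c(56, 58, 64, 67, 68, 73, 78, 83, 84, 88,
                              89, 90, 91, 92, 93, 93, 94, 95, 97, 99))

#place each value into a decile
data$decile <- ntile(data$values, 10)

#view data
data

   values decile
1      56      1
2      58      1
3      64      2
4      67      2
5      68      3
6      73      3
7      78      4
8      83      4
9      84      5
10     88      5
11     89      6
12     90      6
13     91      7
14     92      7
15     93      8
16     93      8
17     94      9
18     95      9
19     97     10
20     99     10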
The way to interpret the output is as follows:
- The data value 56 falls between the 0th and 10th percentiles, so it is placed in the first decile.
- The data value 58 falls between the 0th and 10th percentiles, so it is placed in the first decile.
- The data value 64 falls between the 10th and 20th percentiles, so it is placed in the second decile.
- The data value 67 falls between the 10th and 20th percentiles, so it is placed in the second decile.
- The data value 68 falls between the 20th and 30th percentiles, so it is placed in the third decile.
And so on.
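If dplyr is not available, a similar grouping can be sketched in base R by cutting the values at the decile boundaries. This is an alternative technique, not what ntile() does internally: ntile() assigns groups by rank, while cut() assigns them by the computed cut points, so the two can disagree when values are tied at a boundary. The sketch assumes the same 20 values used in this article, and `breaks` and `decile_base` are illustrative names.

```r
# The 20 values used in this article
values <- c(56, 58, 64, 67, 68, 73, 78, 83, 84, 88,
            89, 90, 91, 92, 93, 93, 94, 95, 97, 99)

# Eleven boundaries: the minimum, the nine deciles, and the maximum
breaks <- quantile(values, probs = seq(0, 1, by = .1))

# Assign each value to the interval it falls in;
# include.lowest = TRUE keeps the minimum in the first group,
# labels = FALSE returns group numbers 1 to 10 instead of interval labels
decile_base <- cut(values, breaks = breaks, include.lowest = TRUE, labels = FALSE)

data.frame(values, decile = decile_base)
```

For this dataset the group numbers match the ntile() output shown above, with two values in each decile. Note that cut() requires unique break points, so it would fail on data with so many ties that two boundaries coincide.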