By default, the hist() function in R uses a formula known as Sturges’ Rule to choose the number of bins in a histogram.
However, you can use the following syntax to override this formula and specify an exact number of bins to use in the histogram:
hist(data, breaks = seq(min(data), max(data), length.out = 7))
Note that the number of bins used in the histogram will be one less than the number specified in the length.out argument.
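As a quick check of this relationship, the following sketch builds the breaks for a small made-up vector and counts the resulting bins without drawing a plot. The vector x is hypothetical and is not the data used in the examples below. The sketch also contrasts this with passing a single number to breaks, which R treats only as a suggestion.

```r
# hypothetical sample data, used only for illustration
x <- c(2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 9, 10, 12, 13, 15, 18, 21, 22, 25)

# 7 equally spaced break points from min to max define 6 bins
brks <- seq(min(x), max(x), length.out = 7)

# plot = FALSE returns the histogram object without drawing it
h_exact <- hist(x, breaks = brks, plot = FALSE)
n_breaks <- length(h_exact$breaks)   # number of break points
n_bins <- length(h_exact$counts)     # number of bins (one fewer)

# a single number is only a suggestion; R rounds the breaks to "pretty"
# values, so the resulting bin count may differ from 6
h_suggest <- hist(x, breaks = 6, plot = FALSE)
n_bins_suggest <- length(h_suggest$counts)

print(c(breaks = n_breaks, bins = n_bins, suggested = n_bins_suggest))
```

This is why the syntax above passes a full vector of break points rather than a single number: an explicit vector is used as given.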
The following examples show how to use this syntax in practice.
Example 1: Create a Basic Histogram
The following code shows how to create a basic histogram in R without specifying the number of bins:
#define vector of data
data
#create histogram of data
hist(data, col = 'lightblue')
Using Sturges’ Rule, R decided to use 8 total bins in the histogram.
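For reference, Sturges’ Rule suggests ceiling(log2(n) + 1) bins for n observations. The sketch below computes this for a made-up vector (x is hypothetical, not the data from this example) and compares it with nclass.Sturges() from the grDevices package that ships with R. The number of bins hist() actually draws can differ from this value, because the suggested breaks are then rounded to "pretty" values.

```r
# hypothetical sample data, used only for illustration
x <- c(2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 9, 10, 12, 13, 15, 18, 21, 22, 25)

# Sturges' Rule computed by hand
sturges_manual <- ceiling(log2(length(x)) + 1)

# the same rule as implemented in base R
sturges_builtin <- grDevices::nclass.Sturges(x)

# bins actually used by hist() with its default breaks = "Sturges"
h <- hist(x, plot = FALSE)
bins_drawn <- length(h$counts)

print(c(manual = sturges_manual, builtin = sturges_builtin, drawn = bins_drawn))
```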
Example 2: Specify Number of Bins to Use in Histogram
The following code shows how to create a histogram for the same vector of data and use exactly 6 bins:
#define vector of data
data
#create histogram with 6 bins
hist(data, col = 'lightblue', breaks = seq(min(data), max(data), length.out = 7))
Cautions on Choosing a Specific Number of Bins
The number of bins used in a histogram has a huge impact on how we interpret a dataset.
If we use too few bins, the true underlying pattern in the data can be hidden:
#define vector of data
data
#create histogram with 3 bins
hist(data, col = 'lightblue', breaks = seq(min(data), max(data), length.out = 4))
Conversely, if we use too many bins then we may just be visualizing the noise in a dataset:
#define vector of data
data
#create histogram with 15 bins
hist(data, col = 'lightblue', breaks = seq(min(data), max(data), length.out = 16))
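To compare these choices directly, one option is to draw the same data with several bin counts side by side. The sketch below assumes a made-up vector x and a small helper, equal_width_breaks(), which is a hypothetical name introduced here for illustration and not a base R function.

```r
# hypothetical sample data, used only for illustration
x <- c(2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 9, 10, 12, 13, 15, 18, 21, 22, 25)

# hypothetical helper: n_bins equal-width bins need n_bins + 1 break points
equal_width_breaks <- function(values, n_bins) {
  seq(min(values), max(values), length.out = n_bins + 1)
}

bin_counts <- c(3, 6, 15)

# draw the three histograms in one row, then restore the previous layout
old_par <- par(mfrow = c(1, length(bin_counts)))
for (k in bin_counts) {
  hist(x, col = 'lightblue', breaks = equal_width_breaks(x, k),
       main = paste(k, 'bins'))
}
par(old_par)
```

Viewing the panels together makes it easier to judge which bin count shows the shape of the data without hiding it or exaggerating noise.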
In general, the default Sturges’ Rule used in R tends to produce histograms with a reasonable number of bins for most datasets.
Feel free to use the code provided here to create a histogram with an exact number of bins, but be careful not to choose too many or too few bins.
Additional Resources
The following tutorials explain how to perform other common functions with histograms in R:
How to Plot Multiple Histograms in R
How to Create a Histogram of Two Variables in R
How to Create a Relative Frequency Histogram in R