
The **Empirical Rule**, sometimes called the 68-95-99.7 rule, states that for a given dataset with a normal distribution:

- **68%** of data values fall within one standard deviation of the mean.
- **95%** of data values fall within two standard deviations of the mean.
- **99.7%** of data values fall within three standard deviations of the mean.

In this tutorial, we explain how to apply the Empirical Rule in R to a given dataset.

**Applying the Empirical Rule in R**

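The **pnorm()** function in R returns the value of the cumulative distribution function (CDF) of the normal distribution, that is, the probability that a normally distributed value is less than or equal to a given value.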

This function uses the following basic syntax:

**pnorm(q, mean, sd)**

where:

- **q**: value of the normally distributed random variable at which to evaluate the CDF
- **mean**: mean of the distribution (defaults to 0)
- **sd**: standard deviation of the distribution (defaults to 1)
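As a quick illustration of these arguments, the sketch below calls pnorm() with and without explicit **mean** and **sd** values. It also uses the **lower.tail** argument, which is part of pnorm() but not covered in this tutorial; the variable names are chosen only for this example.

```r
# With no mean or sd supplied, pnorm() assumes the standard normal (mean = 0, sd = 1)
p_center <- pnorm(0)

# Naming the arguments explicitly gives the same result
p_named <- pnorm(0, mean = 0, sd = 1)

# lower.tail = FALSE returns the upper-tail probability P(X > q) instead of P(X <= q)
p_upper <- pnorm(1, lower.tail = FALSE)

print(p_center)              # probability of a value at or below the mean
print(p_named)               # same probability with explicit arguments
print(p_upper + pnorm(1))    # upper and lower tails together cover the whole curve
```

Because the lower and upper tails always sum to 1, either form can be used to work out the area between two values.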

We can use the following syntax to find the area under the standard normal curve that lies within one, two, and three standard deviations of the mean:

    #find area under normal curve within 1 standard deviation of mean
    pnorm(1) - pnorm(-1)

    [1] 0.6826895

    #find area under normal curve within 2 standard deviations of mean
    pnorm(2) - pnorm(-2)

    [1] 0.9544997

    #find area under normal curve within 3 standard deviations of mean
    pnorm(3) - pnorm(-3)

    [1] 0.9973002

From the output we can confirm:

- **68%** of data values fall within one standard deviation of the mean.
- **95%** of data values fall within two standard deviations of the mean.
- **99.7%** of data values fall within three standard deviations of the mean.
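The same percentages can also be checked on simulated data. The following is a minimal sketch, assuming a sample of 100,000 values drawn with rnorm(); the seed, sample size, mean of 50, and standard deviation of 10 are arbitrary choices for illustration, and **within_k** is a hypothetical helper name.

```r
# Simulate a normally distributed dataset (all values here are illustrative)
set.seed(42)
x <- rnorm(100000, mean = 50, sd = 10)

# Use the sample mean and sample standard deviation of the simulated data
m <- mean(x)
s <- sd(x)

# Share of values lying within k standard deviations of the mean
within_k <- function(k) mean(abs(x - m) <= k * s)

# Shares for k = 1, 2, 3
shares <- sapply(1:3, within_k)
print(round(shares, 3))
```

The shares should land close to 0.68, 0.95, and 0.997, with small differences from run to run because the data are random.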

The following examples show how to use the Empirical Rule with different datasets in practice.

**Example 1: Applying the Empirical Rule to a Dataset in R**

Suppose we have a normally distributed dataset with a mean of **7** and a standard deviation of **2.2**.

We can use the following code to find the intervals that contain 68%, 95%, and 99.7% of the data:

    #define mean and standard deviation values
    mean=7
    sd=2.2

    #find which values contain 68% of data
    mean-sd; mean+sd

    [1] 4.8
    [1] 9.2

    #find which values contain 95% of data
    mean-2*sd; mean+2*sd

    [1] 2.6
    [1] 11.4

    #find which values contain 99.7% of data
    mean-3*sd; mean+3*sd

    [1] 0.4
    [1] 13.6

From this output, we can see:

- 68% of the data falls between **4.8** and **9.2**
- 95% of the data falls between **2.6** and **11.4**
- 99.7% of the data falls between **0.4** and **13.6**
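The three calculations in Example 1 can be wrapped in a small reusable function. This is only a sketch: **empirical_rule_bounds** and its arguments **mu** and **sigma** are hypothetical names introduced here, not part of base R.

```r
# Hypothetical helper: intervals covering roughly 68%, 95%, and 99.7% of a
# normal distribution with mean mu and standard deviation sigma
empirical_rule_bounds <- function(mu, sigma) {
  k <- 1:3
  data.frame(
    coverage    = c("68%", "95%", "99.7%"),
    lower       = mu - k * sigma,
    upper       = mu + k * sigma,
    exact_share = pnorm(k) - pnorm(-k)  # exact area within k standard deviations
  )
}

# Same mean and standard deviation as Example 1
bounds <- empirical_rule_bounds(mu = 7, sigma = 2.2)
print(bounds)
```

Using **mu** and **sigma** as argument names also avoids reusing the names of R's built-in mean() and sd() functions.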

**Example 2: Finding What Percentage of Data Falls Between Certain Values**

Imagine we have a normally distributed dataset with a mean of 100 and standard deviation of 5.

Suppose we want to know what percentage of the data falls between the values **99** and **105** in this distribution.

We can use the **pnorm()** function to find the answer:

    #find area under normal curve between 99 and 105
    pnorm(105, mean=100, sd=5) - pnorm(99, mean=100, sd=5)

    [1] 0.4206045

We see that about **42.06%** of the data falls between the values 99 and 105 for this distribution.
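An equivalent way to reach this result is to convert both values to z-scores and use the standard normal curve. The sketch below assumes the same mean of 100 and standard deviation of 5; the variable names are chosen only for this example.

```r
# Parameters from Example 2
mu <- 100
sigma <- 5

# Convert each endpoint to a z-score (number of standard deviations from the mean)
z_low  <- (99 - mu) / sigma    # 0.2 standard deviations below the mean
z_high <- (105 - mu) / sigma   # 1 standard deviation above the mean

# Area between the two z-scores on the standard normal curve
share_z <- pnorm(z_high) - pnorm(z_low)

# Area computed directly with mean and sd, as in Example 2
share_raw <- pnorm(105, mean = mu, sd = sigma) - pnorm(99, mean = mu, sd = sigma)

print(round(share_z, 4))              # 0.4206
print(all.equal(share_z, share_raw))  # TRUE
```

The z-score form makes it clear that the interval runs from 0.2 standard deviations below the mean to one standard deviation above it, which is why the result is smaller than the 68% covered by a full one-standard-deviation band.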

**Additional Resources**

How to Apply the Empirical Rule in Excel

Empirical Rule Practice Problems

Empirical Rule Calculator