MySQL Standard Deviation
The standard deviation is statistics that measures the number of variations or dispersion of a set of values relative to the mean. The standard deviation is higher when the data points are far from the mean. Thus, the more spread out the data, the greater its standard deviation. And when the standard deviation is lower, it means the values are closer to the mean. It also measures how much variations exist from the mean.
A standard deviation can also be defined as a square root of the variance. We can calculate the standard deviation using the following points:
- Find the mean by calculating the average of all values in the data set. For example, if we have a data set containing the values 2, 4 and 6, the mean is (2+4+6)/3 = 4.
- Next, for each value, compute the difference from the mean and then square each result. For example, (2-4)2= (-2)2 = 4, (4-4)2 = (0)2 = 0, (6-4)2 = (2)2 = 4.
- Next, calculate the mean from the values found in step 2, which gives the variance. For example, ((4+0+4)/3) = 5.33333333.
- Finally, take the square root of the variance that produces the standard deviation in the data set. For example, the square root of (5.33333333) = 2.3094010768.
Sample Standard Deviation vs. Population Standard Deviation
A standard deviation is called sample standard deviation when only a few or a subset of values is used in the calculation.
A standard deviation is called population standard deviation when all data set values are used in the calculation.
The standard deviation is represented by using the sigma letter (σ) symbol. See the below equations that explain how we can calculate sample standard deviation and population standard deviation:
The equation for simple standard deviation:
The equation for population standard deviation:
In the above equations, the N is the number of values available in the data set.
MySQL Standard Deviation Functions
MySQL provides a function that allows us to calculate the sample standard deviation and population standard deviation in an easy and quick way. MySQL supports these functions from versions 5.6 and later.
If we want to calculate the sample standard deviation, we can use the below function:
STDDEV_SAMP() Function
This function is used to return the sample standard deviation of an expression (the square root of VAR_SAMP()). This function returns NULL if there are no matching rows found.
We can use this function as follows:
Example:
Suppose we have a table named “employee” that contains the following data:
The below statement calculates the sample standard deviation of the salaries in the employee table:
Output:
We will get the below result:
If we want to calculate the population standard deviation, we can use the below functions:
STD() Function
This function is used to returns the population standard deviation of the given expression. It is a synonym of the STDDEV_POP() function. The STD() function returns NULL if there are no matching rows found.
We can use this function as follows:
Example:
The below statement calculates the population standard deviation of the salaries in the employee table:
Output:
We will get the below result:
STDDEV() Function
This function is used to returns the population standard deviation by calculating the statistical information of a given expression. The STDDEV() function returns NULL if there are no matching rows found.
We can use this function as follows:
Example:
The below statement calculates the sample standard deviation of the salaries in the employee table:
Output:
We will get the below result:
STDDEV_POP() Function
This function is used to returns the population standard deviation of a given expression (the square root of VAR_POP()). The STDDEV_POP() function returns NULL if there are no matching rows found.
We can use this function as follows:
Example:
The below statement calculates the sample standard deviation of the salaries in the employee table:
Output:
We will get the below result: