How to Perform an F-Test in Python

by Tutor Aspire

An F-test is used to test whether two population variances are equal. The null and alternative hypotheses for the test are as follows:

H0: σ₁² = σ₂² (the population variances are equal)

H1: σ₁² ≠ σ₂² (the population variances are not equal)

This tutorial explains how to perform an F-test in Python.

Example: F-Test in Python

Suppose we have the following two samples:

x = [18, 19, 22, 25, 27, 28, 41, 45, 51, 55]
y = [14, 15, 15, 17, 18, 22, 25, 25, 27, 34]

We can use the following function to perform an F-test to determine if the two populations these samples came from have equal variances:

import numpy as np
import scipy.stats

#define F-test function
def f_test(x, y):
    x = np.array(x)
    y = np.array(y)
    f = np.var(x, ddof=1)/np.var(y, ddof=1) #calculate F test statistic (ratio of sample variances)
    dfn = x.size-1 #define degrees of freedom numerator
    dfd = y.size-1 #define degrees of freedom denominator
    p = 1-scipy.stats.f.cdf(f, dfn, dfd) #find upper-tail p-value of F test statistic
    return f, p

#perform F-test
f_test(x, y)

(4.38712, 0.019127)

The F test statistic is 4.38712 and the corresponding p-value is 0.019127. Since this p-value is less than .05, we would reject the null hypothesis. This means we have sufficient evidence to say that the two population variances are not equal.
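
The same decision can also be reached by comparing the test statistic with a critical value rather than a p-value. The following is a minimal sketch, assuming a significance level of 0.05 and the same upper-tail test used above; the names alpha, f_stat, f_crit, and reject are illustrative and not part of the original function.

```python
import numpy as np
from scipy import stats

x = [18, 19, 22, 25, 27, 28, 41, 45, 51, 55]
y = [14, 15, 15, 17, 18, 22, 25, 25, 27, 34]

alpha = 0.05  # assumed significance level

# F test statistic: ratio of the two sample variances (ddof=1)
f_stat = np.var(x, ddof=1) / np.var(y, ddof=1)

# upper-tail critical value of the F distribution
# with (n1-1, n2-1) degrees of freedom
f_crit = stats.f.ppf(1 - alpha, len(x) - 1, len(y) - 1)

# reject H0 when the statistic exceeds the critical value
reject = f_stat > f_crit
print(f_stat, f_crit, reject)
```

If the test statistic is larger than the critical value, the p-value is smaller than alpha, so both approaches lead to the same conclusion.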

Notes

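  • The F test statistic is calculated as s₁² / s₂², the ratio of the two sample variances. By default, numpy.var calculates the population variance (ddof=0). To calculate the sample variance, we need to specify ddof=1.
  • The p-value corresponds to 1 – cdf of the F distribution with numerator degrees of freedom = n₁ – 1 and denominator degrees of freedom = n₂ – 1. This is an upper-tail (one-sided) probability. For the two-sided alternative H1 stated above, a common convention is to double it: 2 × 0.019127 = 0.038254, which is still less than .05, so the conclusion is unchanged.
  • Because the p-value is an upper-tail probability, the function only gives a meaningful result when the first sample variance is larger than the second sample variance (F ≥ 1). Thus, pass the sample with the larger variance as x.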
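
The restriction on argument order can be lifted by letting the function decide which variance goes in the numerator. The following is a minimal sketch, not the tutorial's original function: the name f_test_two_sided is hypothetical, and doubling the upper-tail probability is one common convention for a two-sided F-test, assumed here.

```python
import numpy as np
from scipy import stats

def f_test_two_sided(x, y):
    """Two-sided F-test for equal variances; argument order does not matter."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    var_x = np.var(x, ddof=1)  # sample variance of x
    var_y = np.var(y, ddof=1)  # sample variance of y

    # put the larger sample variance in the numerator so that F >= 1
    if var_x >= var_y:
        f, dfn, dfd = var_x / var_y, x.size - 1, y.size - 1
    else:
        f, dfn, dfd = var_y / var_x, y.size - 1, x.size - 1

    # double the upper-tail probability (assumed convention), capped at 1
    p = min(1.0, 2 * stats.f.sf(f, dfn, dfd))
    return f, p

x = [18, 19, 22, 25, 27, 28, 41, 45, 51, 55]
y = [14, 15, 15, 17, 18, 22, 25, 25, 27, 34]

print(f_test_two_sided(x, y))
print(f_test_two_sided(y, x))  # same result with the samples swapped
```

Here stats.f.sf is the survival function, which equals 1 – cdf.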

When to Use the F-Test

The F-test is typically used to answer one of the following questions:

1. Do two samples come from populations with equal variances?

2. Does a new treatment or process reduce the variability of some current treatment or process?
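
The second question is one-sided: the alternative hypothesis is that the new process has a smaller variance than the current one. The following is a minimal sketch under that assumption. The function name f_test_less_variable and the measurements in current and new are hypothetical and only serve as an illustration.

```python
import numpy as np
from scipy import stats

def f_test_less_variable(current, new):
    """One-sided F-test of H1: variance of `new` < variance of `current`."""
    current = np.asarray(current, dtype=float)
    new = np.asarray(new, dtype=float)

    # ratio of sample variances; large values support H1
    f = np.var(current, ddof=1) / np.var(new, ddof=1)

    # upper-tail probability of the F distribution (no doubling for a one-sided test)
    p = stats.f.sf(f, current.size - 1, new.size - 1)
    return f, p

# hypothetical measurements, made up for this example
current = [10.2, 9.1, 11.5, 8.7, 12.3, 9.9, 10.8, 7.6]
new = [10.1, 9.8, 10.3, 9.9, 10.2, 10.0, 9.7, 10.4]

f, p = f_test_less_variable(current, new)
print(f, p)
```

A small p-value here would suggest that the new process is less variable than the current one.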

