Tetrachoric correlation is a measure of the correlation between two binary variables – that is, variables that can only take on two values like “yes” and “no” or “good” and “bad.”
This type of correlation is often used in surveys and personality tests in which the questions being asked only have two possible response values.
The value for a tetrachoric correlation can range from -1 to 1 where:
- -1 indicates a perfect negative correlation between the two variables.
- 0 indicates no correlation between the two variables.
- 1 indicates a perfect positive correlation between the two variables.
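Note: The tetrachoric correlation assumes that each binary variable is a dichotomized version of an underlying continuous variable, and that these two underlying variables follow a bivariate normal distribution.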
How to Calculate Tetrachoric Correlation
Suppose we have the following 2×2 table with two variables, x and y, that both take on two values:
A commonly used approximation for the tetrachoric correlation between the two variables in this table is:
Tetrachoric correlation ≈ COS(π / (1 + √(ad / (bc))))
where:
- COS represents the cosine function
- π represents the numerical value Pi, equal to 3.141592…
- a, b, c, d represent the counts in the four cells of the 2×2 table, with a and d on one diagonal and b and c on the other
Note: If you have two ordinal variables (that can take on more than just two values) then you can instead calculate the polychoric correlation.
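The formula above can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the function name `tetrachoric_approx` is hypothetical, and the choice to raise an error when b or c is zero (where the ratio ad/(bc) is undefined) is an assumption made here, not something the formula itself specifies.

```python
import math

def tetrachoric_approx(a, b, c, d):
    """Cosine approximation of the tetrachoric correlation for a 2x2 table.

    a and d are the counts on one diagonal, b and c the counts on the other.
    """
    if b == 0 or c == 0:
        # Assumption: treat a zero off-diagonal cell as an error,
        # because ad/(bc) cannot be computed in that case.
        raise ValueError("b and c must be nonzero")
    ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1 + math.sqrt(ratio)))
```

When all four cells are equal, the ratio is 1 and the function returns cos(π/2), which is zero up to floating-point error, matching the "no correlation" case.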
Example: Calculating Tetrachoric Correlation
Suppose we want to know whether gender is associated with political party preference. We take a simple random sample of 100 voters and survey them on their political party preference.
The following table shows the results of the survey:
We would calculate the tetrachoric correlation between these two variables as:
Tetrachoric correlation = COS(π / (1 + √((19×39) / (30×12)))) ≈ 0.277.
This correlation is fairly low, which indicates a weak positive association between gender and political party preference.
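The arithmetic in this example can be checked with a few lines of Python. The assignment of counts to a, b, c and d below is inferred from the formula as written in the example (a = 19, d = 39 in the numerator; b = 30, c = 12 in the denominator), so treat the variable layout as an assumption.

```python
import math

# Cell counts as they appear in the worked formula
a, b, c, d = 19, 30, 12, 39

n = a + b + c + d               # total number of voters surveyed
ratio = (a * d) / (b * c)       # ad / (bc), roughly 2.058
r = math.cos(math.pi / (1 + math.sqrt(ratio)))

print(n)            # 100
print(round(r, 3))  # 0.277
```

The four counts sum to 100, which matches the sample size stated in the example.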
Additional Resources
An Introduction to the Pearson Correlation Coefficient
An Introduction to Kendall’s Tau