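Polychoric correlation measures the association between two ordinal variables. It estimates the correlation between the continuous, normally distributed latent variables that are assumed to underlie the observed ordered categories.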
Recall that ordinal variables are variables whose possible values are categorical and have a natural order.
Some examples of variables measured on an ordinal scale include:
- Satisfaction: Very unsatisfied, unsatisfied, neutral, satisfied, very satisfied
- Income level: Low income, medium income, high income
- Workplace status: Entry Analyst, Analyst I, Analyst II, Lead Analyst
- Degree of pain: Small amount, medium amount, high amount
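As a minimal sketch, an ordinal variable like the satisfaction scale above can be stored in R as an ordered factor. The responses below are made up for illustration, and the variable names are hypothetical.

```r
# Hypothetical survey responses (made-up values, for illustration only)
responses <- c("neutral", "satisfied", "very unsatisfied",
               "satisfied", "very satisfied")

# Levels listed from lowest to highest; ordered = TRUE records that order
satisfaction_levels <- c("very unsatisfied", "unsatisfied", "neutral",
                         "satisfied", "very satisfied")
satisfaction <- factor(responses, levels = satisfaction_levels, ordered = TRUE)

# Ordered factors support comparisons between categories
is_satisfied <- satisfaction >= "satisfied"

# Integer codes behind the factor (1 = lowest level)
codes <- as.integer(satisfaction)
```

Storing the order explicitly keeps R from sorting the categories alphabetically, which would place "neutral" before "satisfied" only by coincidence.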
The value for polychoric correlation ranges from -1 to 1 where:
- -1 indicates a perfect negative correlation
- 0 indicates no correlation
- 1 indicates a perfect positive correlation
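One way to build intuition for these values is a small simulation. The sketch below assumes two latent normal variables with a correlation of 0.6 (an arbitrary choice), cuts each into three ordered categories at arbitrary thresholds, and compares the latent correlation with the Pearson correlation of the category codes. The optional polychor() call only runs if the polycor package is installed.

```r
set.seed(1)
n   <- 5000
rho <- 0.6  # assumed latent correlation, chosen only for illustration

# Two standard normal latent variables with correlation rho
z1 <- rnorm(n)
z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)

# Cut each latent variable into 3 ordered categories (thresholds are arbitrary)
cuts <- c(-Inf, -0.5, 0.5, Inf)
x <- cut(z1, breaks = cuts, labels = FALSE)
y <- cut(z2, breaks = cuts, labels = FALSE)

latent_r <- cor(z1, z2)  # correlation of the unobserved latent variables
coded_r  <- cor(x, y)    # Pearson correlation of the 1-3 category codes

# If polycor is available, polychor() estimates the latent correlation
# from the observed categories alone
if (requireNamespace("polycor", quietly = TRUE)) {
  poly_r <- polycor::polychor(x, y)
}
```

In simulations like this, the Pearson correlation of the codes tends to be smaller in magnitude than the latent correlation, which is the gap polychoric correlation is designed to close.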
We can use the polychor(x, y) function from the polycor package to calculate the polychoric correlation between two ordinal variables in R.
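As a rough sketch of the call itself: polychor() accepts either two ordinal vectors or a contingency table of their counts. The ratings below are made up, and the names rater_a and rater_b are hypothetical. The call is guarded so the snippet still runs when polycor is not installed.

```r
# Hypothetical ratings from two raters on a 1-3 scale (made-up values)
rater_a <- c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 1, 2, 3, 3, 2, 1, 2, 3, 2, 1)
rater_b <- c(1, 1, 2, 2, 2, 3, 1, 3, 3, 2, 1, 2, 3, 3, 2, 2, 3, 3, 2, 1)

# Cross-tabulate the ratings; rows are rater_a, columns are rater_b
rating_table <- table(rater_a, rater_b)

if (requireNamespace("polycor", quietly = TRUE)) {
  # Same estimate from the raw vectors and from the table of counts
  rho_from_vectors <- polycor::polychor(rater_a, rater_b)
  rho_from_table   <- polycor::polychor(rating_table)
}
```

Passing a table can be convenient when only the cross-tabulated counts are available rather than the individual responses.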
The following examples show how to use this function in practice.
Example 1: Calculate Polychoric Correlation for Movie Ratings
Suppose we want to know whether the ratings from two different movie ratings agencies are highly correlated.
We ask each agency to rate 20 different movies on a scale of 1 to 3 where:
- 1 indicates “bad”
- 2 indicates “mediocre”
- 3 indicates “good”
We can use the following code in R to calculate the polychoric correlation between the ratings of the two agencies:
library(polycor)

#define movie ratings for each agency
agency1

#calculate polychoric correlation between ratings
polychor(agency1, agency2)

[1] 0.7828328
The polychoric correlation turns out to be 0.78.
This value is quite high, which indicates that there is a strong positive association between the ratings from each agency.
Example 2: Calculate Polychoric Correlation for Restaurant Ratings
Suppose we want to know whether the customer ratings of two different neighborhood restaurants are correlated.
We randomly survey 20 customers who ate at both restaurants and ask them to rate their overall satisfaction on a scale of 1 to 5 where:
- 1 indicates “very unsatisfied”
- 2 indicates “unsatisfied”
- 3 indicates “neutral”
- 4 indicates “satisfied”
- 5 indicates “very satisfied”
We can use the following code in R to calculate the polychoric correlation between the ratings of the two restaurants:
library(polycor)

#define ratings for each restaurant
restaurant1

#calculate polychoric correlation between ratings
polychor(restaurant1, restaurant2)

[1] -0.1322774
The polychoric correlation turns out to be -0.13.
This value is close to zero, which indicates that there is very little (if any) association between the ratings of the restaurants.
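With only 20 observations, an estimate near zero carries a lot of uncertainty. As a hedged sketch, polychor() can also return a standard error when called with std.err = TRUE. The ratings below are made up (they are not the restaurant data from this example), the names r1 and r2 are hypothetical, and the interval is only a rough normal approximation.

```r
# Hypothetical ratings on a 1-5 scale (made-up values, not the data above)
r1 <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 3, 2, 4, 5, 1, 3, 4, 2)
r2 <- c(2, 1, 3, 3, 2, 4, 4, 5, 3, 4, 5, 4, 3, 2, 5, 5, 1, 2, 4, 3)

if (requireNamespace("polycor", quietly = TRUE)) {
  # std.err = TRUE returns an object holding the estimate and its variance
  fit <- polycor::polychor(r1, r2, std.err = TRUE)

  rho_hat <- fit$rho
  se_hat  <- sqrt(as.numeric(fit$var)[1])

  # Rough 95% interval based on a normal approximation
  approx_ci <- c(rho_hat - 1.96 * se_hat, rho_hat + 1.96 * se_hat)
}
```

A wide interval that includes zero would be consistent with the "very little (if any)" reading above, while a narrow one would support a firmer conclusion.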
Additional Resources
The following tutorials explain how to calculate other common correlation coefficients in R:
How to Calculate Spearman Rank Correlation in R
How to Calculate Point-Biserial Correlation in R
How to Calculate Cross Correlation in R
How to Calculate Rolling Correlation in R
How to Calculate Partial Correlation in R