One way to quantify the relationship between two variables is to use the Pearson correlation coefficient, which is a measure of the linear association between two variables.
It takes on a value between -1 and 1 where:
- -1 indicates a perfect negative linear correlation.
- 0 indicates no linear correlation.
- 1 indicates a perfect positive linear correlation.
The further away the correlation coefficient is from zero, the stronger the linear relationship between the two variables.
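To make the definition concrete, here is a minimal sketch that computes the Pearson coefficient by hand and compares it with the built-in corrcoef function. The vectors x and y are hypothetical values chosen for illustration and are not part of the dataset used later in this tutorial.

```matlab
% Hypothetical example data (not the tutorial's dataset)
x = [1 2 3 4 5]';
y = [2 4 5 4 5]';

% Deviations of each variable from its mean
dx = x - mean(x);
dy = y - mean(y);

% Pearson r: sum of cross-products divided by the square root
% of the product of the sums of squared deviations
r_manual = sum(dx .* dy) / sqrt(sum(dx.^2) * sum(dy.^2));

% corrcoef(x, y) returns a 2x2 matrix; the off-diagonal entry is r
R = corrcoef(x, y);
r_builtin = R(1,2);

fprintf('manual: %.4f, built-in: %.4f\n', r_manual, r_builtin);
```

For these values both approaches give r of about 0.7746, a fairly strong positive linear relationship.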
But in some cases we want to understand the correlation between more than just one pair of variables.
In these cases, we can create a correlation matrix, which is a square table that shows the correlation coefficients between every pairwise combination of several variables.
This tutorial explains how to create and interpret a correlation matrix in MATLAB.
How to Create a Correlation Matrix in MATLAB
Use the following steps to create a correlation matrix in MATLAB.
Step 1: Create the dataset.
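rng(0);              % seed the random number generator for reproducibility
A = randn(10,1);     % 10 observations per variable
B = randn(10,1);
C = randn(10,1);
all = [A B C];       % 10x3 matrix, one column per variable (note: this name shadows the built-in all function)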
Step 2: Create the correlation matrix.
R = corrcoef(all)
R =
1.0000 0.4518 -0.5003
0.4518 1.0000 -0.8017
-0.5003 -0.8017 1.0000
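Because corrcoef returns a plain numeric matrix, it can be easy to lose track of which row and column belong to which variable. One possible approach, sketched below, is to wrap the result in a table with array2table. The variable names data, names, and T are assumptions made for this illustration, and the data is generated fresh rather than taken from Step 1.

```matlab
% Illustrative data: 10 observations (rows) of 3 variables (columns)
rng(0);
data = randn(10, 3);
R = corrcoef(data);

% Attach the variable names to both the rows and the columns
names = {'A', 'B', 'C'};
T = array2table(R, 'VariableNames', names, 'RowNames', names);
disp(T)

% Look up a coefficient by name instead of by position
r_AB = T{'A', 'B'};
```

Indexing by name gives the same value as R(1,2), but the labels make the output easier to read when there are many variables.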
Step 3: Interpret the correlation matrix.
The correlation coefficients along the diagonal of the table are all equal to 1 because each variable is perfectly correlated with itself.
All of the other correlation coefficients indicate the correlation between different pairwise combinations of variables. For example:
- The correlation coefficient between A and B is 0.4518.
- The correlation coefficient between A and C is -0.5003.
- The correlation coefficient between B and C is -0.8017.
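Since the matrix is symmetric, each pair appears twice: once above the diagonal and once below it. The sketch below lists every unique pair exactly once by walking the strict upper triangle. The matrix is typed in from the output shown in Step 2, and the cell array names is an assumption added for labeling.

```matlab
% Correlation matrix copied from the Step 2 output (columns ordered A, B, C)
R = [ 1.0000  0.4518 -0.5003;
      0.4518  1.0000 -0.8017;
     -0.5003 -0.8017  1.0000];

% The matrix equals its transpose, so R(i,j) and R(j,i) match
isSym = isequal(R, R');

% Row and column indices of the entries strictly above the diagonal
[row, col] = find(triu(true(size(R)), 1));

% Print each unique pair once
names = {'A', 'B', 'C'};   % labels assumed for this illustration
for k = 1:numel(row)
    fprintf('%s-%s: %.4f\n', names{row(k)}, names{col(k)}, R(row(k), col(k)));
end
```

This prints the A-B, A-C, and B-C coefficients in that order, matching the three values listed above.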
Step 4: Find the p-values of the correlation coefficients.
[R,P] = corrcoef(all)
R =
1.0000 0.4518 -0.5003
0.4518 1.0000 -0.8017
-0.5003 -0.8017 1.0000
P =
1.0000 0.1899 0.1408
0.1899 1.0000 0.0053
0.1408 0.0053 1.0000
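As far as the corrcoef documentation describes it, each p-value comes from converting the correlation into a t-statistic with n - 2 degrees of freedom, under the null hypothesis that the true correlation is zero. The sketch below reproduces that calculation for the B and C pair. It is an illustration of the standard t-test for a correlation, not the function's actual source code, and it uses betainc so that no additional toolbox is assumed.

```matlab
% Values taken from the tutorial output for variables B and C
r = -0.8017;   % correlation coefficient (rounded to 4 decimals)
n = 10;        % number of observations per variable

% t-statistic for testing whether the true correlation is zero
df = n - 2;
t = r * sqrt(df / (1 - r^2));

% Two-sided p-value from the t distribution, written with the
% regularized incomplete beta function
p = betainc(df / (df + t^2), df/2, 0.5);

fprintf('t = %.4f, p = %.4f\n', t, p);
```

The result is close to the 0.0053 reported in P. Small differences are expected because r is rounded here.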
Each off-diagonal entry of P is the p-value for the corresponding correlation coefficient in R:
- The p-value for the correlation coefficient between A and B is 0.1899.
- The p-value for the correlation coefficient between A and C is 0.1408.
- The p-value for the correlation coefficient between B and C is 0.0053.
Each p-value tests the null hypothesis that there is no correlation between the two variables. If the p-value is less than the chosen significance level (e.g., 0.05), we can say that the correlation between the two variables is statistically significant.
In this case, the correlation between variables B and C is the only statistically significant correlation.
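With more than a few variables, reading the P matrix by eye gets tedious. A minimal sketch of how the significant pairs could be picked out automatically is shown below. The matrix is typed in from the Step 4 output, and the names alpha, sigMask, and names are assumptions made for this illustration.

```matlab
% p-value matrix copied from the Step 4 output (columns ordered A, B, C)
P = [1.0000 0.1899 0.1408;
     0.1899 1.0000 0.0053;
     0.1408 0.0053 1.0000];

alpha = 0.05;   % significance level

% Keep only entries above the diagonal so each pair is counted once
sigMask = triu(P < alpha, 1);
[row, col] = find(sigMask);

% Report the significant pairs
names = {'A', 'B', 'C'};   % labels assumed for this illustration
for k = 1:numel(row)
    fprintf('%s-%s is significant (p = %.4f)\n', ...
        names{row(k)}, names{col(k)}, P(row(k), col(k)));
end
```

For this matrix the only pair reported is B-C, which agrees with the conclusion above.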