Correlation and Variance-Covariance Matrices

Variance-covariance and correlation matrices are among the most important quantitative summaries of a data set: they characterize the statistical dependence between its variables.

Specifically, the covariance measures the extent to which two variables “fluctuate together” (that is, co-vary). The correlation is the covariance normalized by the product of the variables' standard deviations, so that it lies between -1 and +1. A positive correlation means that the variables tend to increase or decrease together; a negative correlation means that one variable tends to increase while the other decreases. Values close to +1 or -1 indicate a high degree of linear dependence between the variables.
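The sign and normalization described above can be illustrated with a minimal sketch in plain Python. The helper names `sample_covariance` and `correlation` and the toy data are assumptions made for illustration only; they are not part of any library API. The sketch uses the sample (n-1) normalization.

```python
import math

def sample_covariance(a, b):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)

def correlation(a, b):
    """Covariance normalized by the product of the standard deviations."""
    return sample_covariance(a, b) / math.sqrt(
        sample_covariance(a, a) * sample_covariance(b, b)
    )

# Hypothetical toy data: y_up rises with x, y_down falls as x rises.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2.0 * v + 1.0 for v in x]
y_down = [10.0 - 3.0 * v for v in x]

r_up = correlation(x, y_up)        # close to +1: exact increasing linear relation
r_down = correlation(x, y_down)    # close to -1: exact decreasing linear relation

# Rescaling x changes the covariance but leaves the correlation unchanged.
x_scaled = [10.0 * v for v in x]
cov_original = sample_covariance(x, y_up)
cov_scaled = sample_covariance(x_scaled, y_up)
r_scaled = correlation(x_scaled, y_up)
```

Because the covariance depends on the units of the variables while the correlation does not, the correlation is usually the easier quantity to compare across pairs of variables.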

Details

Given a set $$X$$ of $$n$$ feature vectors $$x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})$$ of dimension $$p$$, the problem is to compute the sample means and variance-covariance matrix or correlation matrix:

Correlation and Variance-Covariance Matrices

| Statistic | Definition |
|---|---|
| Means | $$M = (m(1), \ldots, m(p))$$, where $$m(j)=\frac{1}{n}\sum_{i=1}^{n}x_{ij}$$, $$j=\overline{1,p}$$ |
| Variance-covariance matrix | $$Cov = (v_{ij})$$, where $$v_{ij}=\frac{1}{n-1}\sum_{k=1}^{n}(x_{ki}-m(i))(x_{kj}-m(j))$$, $$i=\overline{1,p}$$, $$j=\overline{1,p}$$ |
| Correlation matrix | $$Cor = (c_{ij})$$, where $$c_{ij}=\frac{v_{ij}}{\sqrt{v_{ii}\cdot v_{jj}}}$$, $$i=\overline{1,p}$$, $$j=\overline{1,p}$$ |
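The three definitions can be sketched directly in plain Python. This is a minimal illustration of the formulas, not the library's implementation: the function names `column_means`, `covariance_matrix`, and `correlation_matrix` and the sample data are hypothetical, and the input is assumed to be a list of $$n$$ rows with $$p$$ values each.

```python
import math

def column_means(X):
    """M = (m(1), ..., m(p)): mean of each of the p features."""
    n = len(X)
    p = len(X[0])
    return [sum(row[j] for row in X) / n for j in range(p)]

def covariance_matrix(X):
    """Cov = (v_ij), using the 1 / (n - 1) normalization."""
    n = len(X)
    p = len(X[0])
    m = column_means(X)
    return [
        [
            sum((row[i] - m[i]) * (row[j] - m[j]) for row in X) / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]

def correlation_matrix(cov):
    """Cor = (c_ij), where c_ij = v_ij / sqrt(v_ii * v_jj)."""
    p = len(cov)
    return [
        [cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(p)]
        for i in range(p)
    ]

# Hypothetical data: n = 4 observations, p = 3 features.
# The second feature is exactly twice the first.
X = [
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 3.0],
    [3.0, 6.0, 4.0],
    [4.0, 8.0, 0.0],
]

M = column_means(X)
Cov = covariance_matrix(X)
Cor = correlation_matrix(Cov)
```

Both matrices are symmetric, and the diagonal of the correlation matrix equals 1 whenever the corresponding variance is nonzero. The sketch does not guard against a constant feature, for which $$v_{ii} = 0$$ and the correlation is undefined.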

Computation

The following computation modes are available:

Examples

Batch Processing: