Covariance¶

The covariance algorithm computes the following quantitative characteristics of a dataset:

• means

• covariance

• correlation

Operation | Computational methods | Programming Interface
dense | dense | compute(…), compute_input, compute_result

Mathematical formulation¶

Computing¶

Given a set $$X$$ of $$n$$ feature vectors of dimension $$p$$, $$x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})$$, the problem is to compute the sample means, the covariance matrix, or the correlation matrix:

Statistic

Definition

Means

$$M = (m(1), \ldots, m(p))$$, where $$m(j)=\frac{1}{n}\sum_{i=1}^{n}x_{ij}$$

Covariance matrix

$$Cov = (v_{ij})$$, where $$v_{ij}=\frac{1}{n-1}\sum_{k=1}^{n}(x_{ki}-m(i))(x_{kj}-m(j))$$, $$i=\overline{1,p}$$, $$j=\overline{1,p}$$

Correlation matrix

$$Cor = (c_{ij})$$, where $$c_{ij}=\frac{v_{ij}}{\sqrt{v_{ii}\cdot v_{jj}}}$$, $$i=\overline{1,p}$$, $$j=\overline{1,p}$$
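
The three definitions above can be checked on a small dataset. The following is a minimal sketch in plain Python, not the library's implementation; the function names `column_means`, `covariance_matrix`, and `correlation_matrix` and the sample data `X` are hypothetical and chosen only for illustration.

```python
from math import sqrt


def column_means(X):
    """Return M = (m(1), ..., m(p)): the mean of each of the p columns."""
    n = len(X)
    p = len(X[0])
    return [sum(row[j] for row in X) / n for j in range(p)]


def covariance_matrix(X):
    """Return Cov = (v_ij) with the 1/(n-1) normalization from the definition."""
    n = len(X)
    p = len(X[0])
    m = column_means(X)
    return [
        [sum((row[i] - m[i]) * (row[j] - m[j]) for row in X) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]


def correlation_matrix(X):
    """Return Cor = (c_ij), where c_ij = v_ij / sqrt(v_ii * v_jj)."""
    cov = covariance_matrix(X)
    p = len(cov)
    return [
        [cov[i][j] / sqrt(cov[i][i] * cov[j][j]) for j in range(p)]
        for i in range(p)
    ]


# Illustrative data: n = 3 observations, p = 2 features.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 9.0]]

means = column_means(X)      # [2.0, 5.0]
cov = covariance_matrix(X)   # [[1.0, 3.5], [3.5, 13.0]]
cor = correlation_matrix(X)  # off-diagonal entries equal 3.5 / sqrt(13)
```

Note that the correlation matrix is undefined when a feature has zero variance, because $$v_{ii} = 0$$ makes the denominator vanish; the sketch does not guard against that case.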

Computation method: dense¶

The dense method computes the means, the variance-covariance matrix, or the correlation matrix.

Programming Interface¶

Refer to API Reference: Covariance.

Distributed mode¶

The algorithm supports distributed execution in SPMD mode (only on GPU).
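
To illustrate why covariance lends itself to distributed execution, the sketch below shows one common way to combine per-worker results: each worker reduces its block of rows to a row count, column sums, and raw cross-products, and these partial statistics are added together before the final formula is applied. This is an assumption made for illustration only; it is not taken from the library and may differ from how the library computes in SPMD mode. The helper names `partial_sums`, `merge`, and `finalize` are hypothetical.

```python
def partial_sums(block):
    """Per-worker statistics: row count, column sums, raw cross-products."""
    n = len(block)
    p = len(block[0])
    s = [sum(row[j] for row in block) for j in range(p)]
    c = [[sum(row[i] * row[j] for row in block) for j in range(p)]
         for i in range(p)]
    return n, s, c


def merge(a, b):
    """Combine the statistics of two blocks by element-wise addition."""
    n = a[0] + b[0]
    s = [x + y for x, y in zip(a[1], b[1])]
    c = [[x + y for x, y in zip(row_a, row_b)]
         for row_a, row_b in zip(a[2], b[2])]
    return n, s, c


def finalize(stats):
    """Turn merged statistics into means and the 1/(n-1) covariance matrix."""
    n, s, c = stats
    p = len(s)
    means = [sj / n for sj in s]
    cov = [[(c[i][j] - s[i] * s[j] / n) / (n - 1) for j in range(p)]
           for i in range(p)]
    return means, cov


# Two hypothetical workers, each holding a block of rows of the same dataset.
blocks = [[[1.0, 2.0], [2.0, 4.0]], [[3.0, 9.0]]]

stats = partial_sums(blocks[0])
for block in blocks[1:]:
    stats = merge(stats, partial_sums(block))

means, cov = finalize(stats)  # means == [2.0, 5.0], cov == [[1.0, 3.5], [3.5, 13.0]]
```

The raw cross-product form is simple to merge but can lose precision when the means are large relative to the spread of the data, so a production implementation would typically use a more numerically stable update.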