# Principal Components Analysis (PCA)¶

Principal Component Analysis (PCA) is an algorithm for exploratory data analysis and dimensionality reduction. PCA transforms a set of feature vectors of possibly correlated features to a new set of uncorrelated features, called principal components. Principal components are the directions of the largest variance, that is, the directions where the data is mostly spread out.

 Operation Computational methods Programming Interface Training Covariance SVD train(…) train_input train_result Inference Covariance SVD infer(…) infer_input infer_result

## Mathematical formulation¶

### Training¶

Given the training set $$X = \{ x_1, \ldots, x_n \}$$ of $$p$$-dimensional feature vectors and the number of principal components $$r$$, the problem is to compute $$r$$ principal directions ($$p$$-dimensional eigenvectors [Lang87]) for the training set. The eigenvectors can be grouped into the $$r \times p$$ matrix $$T$$ that contains one eigenvector in each row.

#### Training method: Covariance¶

This method uses eigenvalue decomposition of the covariance matrix to compute the principal components of the datasets. The method relies on the following steps:

1. Computation of the covariance matrix

2. Computation of the eigenvectors and eigenvalues

3. Formation of the matrices storing the results

Covariance matrix computation is performed in the following way:

1. Compute the vector-column of sums $$s_i = \sum_{j=1}^n x_{i,j}, \quad 1 \leq i \leq p$$.

2. Compute the cross-product $$P = X^TX - s^Ts$$.

3. Compute the covariance matrix $$\Sigma = \frac{1}{n - 1} P$$.

To compute eigenvalues $$\lambda_i$$ and eigenvectors $$\upsilon_i$$, the implementer can choose an arbitrary method such as [Ping14].

The final step is to sort the set of pairs $$(\lambda_i, \upsilon_i)$$ in the descending order by $$\lambda_i$$ and form the resulting matrix $$T = (\upsilon_{i,1}, \cdots, \upsilon_{i,r}), \quad 1 \leq i \leq p$$. Additionally, the means and variances of the initial dataset are returned.

#### Training method: SVD¶

This method uses singular value decomposition of the dataset to compute its principal components. The method relies on the following steps:

1. Computation of the singular values and singular vectors

2. Formation of the matrices storing the results

To compute singular values $$\lambda_i$$ and singular vectors $$u_i$$ and $$v_i$$, the implementer can choose an arbitrary method such as [Demmel90].

The final step is to sort the set of pairs $$(\lambda_i, v_i)$$ in the descending order by $$\lambda_i$$ and form the resulting matrix $$T = (v_{i,1}, \cdots, v_{i,r}), \quad 1 \leq i \leq p$$. Additionally, the means and variances of the initial dataset are returned.

#### Sign-flip technique¶

Eigenvectors computed by some eigenvalue solvers are not uniquely defined due to sign ambiguity. To get the deterministic result, a sign-flip technique should be applied. One of the sign-flip techniques proposed in [Bro07] requires the following modification of matrix $$T$$:

$\hat{T}_i = T_i \cdot \mathrm{sgn}(\max_{1 \leq j \leq p } |{T}_{ij}|), \quad 1 \leq i \leq r,$

where $$T_i$$ is $$i$$-th row, $$T_{ij}$$ is the element in the $$i$$-th row and $$j$$-th column, $$\mathrm{sgn}(\cdot)$$ is the signum function,

$\begin{split}\mathrm{sgn}(x) = \begin{cases} -1, & x < 0, \\ 0, & x = 0, \\ 1, & x > 0. \end{cases}\end{split}$

### Inference¶

Given the inference set $$X' = \{ x_1', \ldots, x_m' \}$$ of $$p$$-dimensional feature vectors and the $$r \times p$$ matrix $$T$$ produced at the training stage, the problem is to transform $$X'$$ to the set $$X'' = \{ x_1'', \ldots, x_m'' \}$$, where $$x_{j}''$$ is an $$r$$-dimensional feature vector, $$1 \leq j \leq m$$.

The feature vector $$x_{j}''$$ is computed through applying linear transformation [Lang87] defined by the matrix $$T$$ to the feature vector $$x_{j}'$$,

(1)$x_{j}'' = T x_{j}', \quad 1 \leq j \leq m.$

#### Inference methods: Covariance and SVD¶

Covariance and SVD inference methods compute $$x_{j}''$$ according to (1).

## Usage example¶

### Training¶

pca::model<> run_training(const table& data) {
const auto pca_desc = pca::descriptor<float>{}
.set_component_count(5)
.set_deterministic(true);

const auto result = train(pca_desc, data);

print_table("means", result.get_means());
print_table("variances", result.get_variances());
print_table("eigenvalues", result.get_eigenvalues());
print_table("eigenvectors", result.get_eigenvectors());

return result.get_model();
}


### Inference¶

table run_inference(const pca::model<>& model,
const table& new_data) {
const auto pca_desc = pca::descriptor<float>{}
.set_component_count(model.get_component_count());

const auto result = infer(pca_desc, model, new_data);

print_table("labels", result.get_transformed_data());
}


## Examples¶

Batch Processing: