Covariance

In statistics, covariance and correlation are two of the most fundamental measures of linear dependence between two random variables. Both describe the joint variability of a pair of features. Correlation is dimensionless, while covariance is expressed in the product of the units of the two features. Covariance is also sensitive to scale: a feature with a large variance inflates it. Correlation removes this effect by dividing the covariance by the product of the two features' standard deviations (the square roots of their variances). Which measure to use depends on the application. The covariance algorithm computes the following:

  • Means

  • Covariance (sample and estimated by maximum likelihood method)

  • Correlation
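The difference between the two measures described above can be illustrated with a minimal standalone sketch. It does not use oneDAL; the helper names `mean`, `sample_covariance`, and `correlation` are hypothetical and chosen only for this illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Arithmetic mean of a sample.
double mean(const std::vector<double>& v) {
    double sum = 0.0;
    for (double x : v) sum += x;
    return sum / static_cast<double>(v.size());
}

// Sample covariance of two equally sized samples (divides by n - 1).
double sample_covariance(const std::vector<double>& x, const std::vector<double>& y) {
    const double mx = mean(x);
    const double my = mean(y);
    double acc = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        acc += (x[i] - mx) * (y[i] - my);
    }
    return acc / static_cast<double>(x.size() - 1);
}

// Correlation: covariance divided by the product of the standard deviations.
double correlation(const std::vector<double>& x, const std::vector<double>& y) {
    return sample_covariance(x, y) /
           std::sqrt(sample_covariance(x, x) * sample_covariance(y, y));
}
```

If one feature is rescaled, for example heights converted from meters to centimeters, `sample_covariance` grows by the same factor of 100 while `correlation` stays the same up to rounding.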

Operation            Computational methods    Programming Interface
Computing            dense                    compute(…), compute_input, compute_result
Partial Computing    dense                    partial_compute(…), partial_compute_input, partial_compute_result
Finalize Computing   dense                    finalize_compute(…), partial_compute_result, compute_result

Mathematical formulation

Refer to Developer Guide: Covariance.
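For orientation, the quantities listed at the top of this page are usually defined as follows for an \(n \times p\) data table with entries \(x_{ij}\). These are the standard textbook definitions, given here as an assumption; the Developer Guide remains the authoritative source and may use different notation.

```latex
\[
\begin{aligned}
m_j &= \frac{1}{n}\sum_{i=1}^{n} x_{ij}, \\
c_{jk}^{\mathrm{sample}} &= \frac{1}{n-1}\sum_{i=1}^{n} (x_{ij}-m_j)(x_{ik}-m_k), \\
c_{jk}^{\mathrm{ML}} &= \frac{1}{n}\sum_{i=1}^{n} (x_{ij}-m_j)(x_{ik}-m_k), \\
r_{jk} &= \frac{c_{jk}}{\sqrt{c_{jj}\,c_{kk}}}.
\end{aligned}
\]
```

Here \(m_j\) is the mean of feature \(j\), \(c_{jk}\) is the covariance of features \(j\) and \(k\) (sample or maximum likelihood estimate), and \(r_{jk}\) is their correlation, which is the same for either covariance estimate because the factor cancels.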

Programming Interface

All types and functions in this section are declared in the oneapi::dal::covariance namespace and are available via inclusion of the oneapi/dal/algo/covariance.hpp header file.

Descriptor

template<typename Float = float, typename Method = method::by_default, typename Task = task::by_default>
class descriptor
Template Parameters
  • Float – The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

  • Method – Tag-type that specifies the implementation of the algorithm. Can be method::dense.

  • Task – Tag-type that specifies the type of the problem to solve. Can be task::compute.

Constructors

descriptor() = default

Creates a new instance of the class with the default property values.

Properties

bool assume_centered
Getter & Setter
bool get_assume_centered() const
auto & set_assume_centered(const bool &value)
result_option_id result_options

Choose which results should be computed and returned.

Getter & Setter
result_option_id get_result_options() const
auto & set_result_options(const result_option_id &value)
bool bias

Choose whether the result is biased.

Getter & Setter
bool get_bias() const
auto & set_bias(const bool &value)

Method tags

struct dense

Tag-type that denotes dense computational method.

using by_default = dense

Alias tag-type for the dense computational method.

Task tags

struct compute

Tag-type that parameterizes entities that are used to compute statistics.

using by_default = compute

Alias tag-type for the compute task.

Computing compute(...)

Input

template<typename Task = task::by_default>
class compute_input
Template Parameters

Task – Tag-type that specifies the type of the problem to solve. Can be task::compute.

Constructors

compute_input()
compute_input(const table &data)

Creates a new instance of the class with the given data property value.

Properties

const table &data

An \(n \times p\) table with the input data, where each row stores one feature vector. Default value: table{}.

Getter & Setter
const table & get_data() const
auto & set_data(const table &value)

Result and Finalize Result

template<typename Task = task::by_default>
class compute_result
Template Parameters

Task – Tag-type that specifies the type of the problem to solve. Can be task::compute.

Constructors

compute_result()

Creates a new instance of the class with the default property values.

Properties

const table &cor_matrix

The correlation matrix. Default value: table{}.

Getter & Setter
const table & get_cor_matrix() const
auto & set_cor_matrix(const table &value)
const result_option_id &result_options

Result options that indicate the availability of the properties. Default value: default_result_options<Task>.

Getter & Setter
const result_option_id & get_result_options() const
auto & set_result_options(const result_option_id &value)
const table &means

Means. Default value: table{}.

Getter & Setter
const table & get_means() const
auto & set_means(const table &value)
const table &cov_matrix

The covariance matrix. Default value: table{}.

Getter & Setter
const table & get_cov_matrix() const
auto & set_cov_matrix(const table &value)

Operation

template<typename Descriptor>
covariance::compute_result compute(const Descriptor &desc, const covariance::compute_input &input)
Parameters
  • desc – Covariance algorithm descriptor covariance::descriptor

  • input – Input data for the computing operation

Preconditions
input.data.is_empty == false

Partial Computing

Partial Input

template<typename Task = task::by_default>
class partial_compute_input

Constructors

partial_compute_input()
partial_compute_input(const table &data)
partial_compute_input(const partial_compute_result<Task> &prev, const table &data)

Properties

const table &data
Getter & Setter
const table & get_data() const
auto & set_data(const table &value)
const partial_compute_result<Task> &prev
Getter & Setter
const partial_compute_result< Task > & get_prev() const
auto & set_prev(const partial_compute_result< Task > &value)

Partial Result and Finalize Input

template<typename Task = task::by_default>
class partial_compute_result

Constructors

partial_compute_result()

Properties

const table &partial_sum

Sums. Default value: table{}.

Getter & Setter
const table & get_partial_sum() const
auto & set_partial_sum(const table &value)
const table &partial_crossproduct

The crossproduct matrix. Default value: table{}.

Getter & Setter
const table & get_partial_crossproduct() const
auto & set_partial_crossproduct(const table &value)
const table &partial_n_rows

The number of observations (rows) processed so far. Default value: table{}.

Getter & Setter
const table & get_partial_n_rows() const
auto & set_partial_n_rows(const table &value)

Finalize Computing

Usage Example

Computing

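#include <iostream>

#include "oneapi/dal/algo/covariance.hpp"

namespace dal = oneapi::dal;

// Assumes an operator<< overload for dal::table is available,
// for example from a printing helper; this section does not define one.
void run_computing(const dal::table& data) {
    // Request only the correlation matrix and the means.
    const auto cov_desc = dal::covariance::descriptor{}.set_result_options(
        dal::covariance::result_options::cor_matrix | dal::covariance::result_options::means);

    const auto result = dal::compute(cov_desc, data);

    std::cout << "Means:\n" << result.get_means() << std::endl;
    std::cout << "Correlation:\n" << result.get_cor_matrix() << std::endl;
}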