Covariance#

In statistics, covariance and correlation are two fundamental measures of linear dependence between two random variables. Both describe the joint variability of a pair of features. The correlation is dimensionless, while the covariance is measured in units obtained by multiplying the units of the two features. Another important distinction is that the covariance depends on the scale of each feature, so a feature with a higher variance inflates it. Correlation removes this effect by dividing the covariance of the two features by the product of their standard deviations (the square roots of their variances). Which measure to use depends on the application. The covariance algorithm computes the following:

• Means

• Covariance (the sample estimate and the maximum likelihood estimate)

• Correlation
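The quantities listed above can be illustrated for a single pair of features with a short, library-independent sketch. The type `pair_stats` and the function `compute_pair_stats` are hypothetical names introduced only for this illustration; they are not part of oneDAL and do not reflect how the library implements the computation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type for illustration only (not a oneDAL type).
struct pair_stats {
    double mean_x;
    double mean_y;
    double cov_sample;   // covariance normalized by n - 1 (sample estimate)
    double cov_mle;      // covariance normalized by n (maximum likelihood estimate)
    double correlation;  // covariance divided by the product of standard deviations
};

// Assumes x.size() == y.size() >= 2 and that neither feature is constant,
// since the correlation is undefined for a zero-variance feature.
pair_stats compute_pair_stats(const std::vector<double>& x, const std::vector<double>& y) {
    const double n = static_cast<double>(x.size());

    double sum_x = 0.0, sum_y = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sum_x += x[i];
        sum_y += y[i];
    }
    const double mean_x = sum_x / n;
    const double mean_y = sum_y / n;

    // Centered sums of products and squares.
    double sxy = 0.0, sxx = 0.0, syy = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double dx = x[i] - mean_x;
        const double dy = y[i] - mean_y;
        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }

    // The normalization factor cancels in the correlation, so it does not
    // depend on whether the biased or unbiased estimate is used.
    return pair_stats{ mean_x, mean_y, sxy / (n - 1.0), sxy / n, sxy / std::sqrt(sxx * syy) };
}
```

For perfectly linearly related features the correlation is 1 regardless of their units, while both covariance estimates scale with the features.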

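| Operation | Computational methods | Programming Interface (function) | Programming Interface (input) | Programming Interface (result) |
|---|---|---|---|---|
| Computing | dense | compute(…) | compute_input | compute_result |
| Partial Computing | dense | partial_compute(…) | partial_compute_input | partial_compute_result |
| Finalize Computing | dense | finalize_compute(…) | partial_compute_result | compute_result |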

Mathematical formulation#

Refer to Developer Guide: Covariance.

Programming Interface#

All types and functions in this section are declared in the oneapi::dal::covariance namespace and are available via inclusion of the oneapi/dal/algo/covariance.hpp header file.

Descriptor#

template<typename Float = float, typename Method = method::by_default, typename Task = task::by_default>
class descriptor#
Template Parameters
• Float – The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

• Method – Tag-type that specifies an implementation of the algorithm. Can be method::dense.

• Task – Tag-type that specifies the type of the problem to solve. Can be task::compute.

Constructors

descriptor() = default#

Creates a new instance of the class with the default property values.

Properties

bool bias#

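Specifies whether the computed covariance is the biased estimate (maximum likelihood, normalized by $$n$$) or the unbiased sample estimate (normalized by $$n - 1$$).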

Getter & Setter
bool get_bias() const
auto & set_bias(const bool &value)
result_option_id result_options#

Choose which results should be computed and returned.

Getter & Setter
result_option_id get_result_options() const
auto & set_result_options(const result_option_id &value)
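Result options are combined with `|` and later checked to see which results are available. The following is a minimal sketch of that general flag pattern, assuming a simple bitmask representation. The names `option_flags`, `contains`, and the three flag constants are hypothetical; the sketch does not reproduce the actual definition of `result_option_id`.

```cpp
#include <cstdint>

// Hypothetical flag type for illustration; not the oneDAL result_option_id.
struct option_flags {
    std::uint64_t mask;
};

// Combining two sets of options yields the union of their bits.
constexpr option_flags operator|(option_flags a, option_flags b) {
    return option_flags{ a.mask | b.mask };
}

// Returns true if every bit of `flag` is present in `set`.
constexpr bool contains(option_flags set, option_flags flag) {
    return (set.mask & flag.mask) == flag.mask;
}

// Hypothetical option constants, one bit each.
constexpr option_flags cov_matrix_flag{ std::uint64_t{ 1 } << 0 };
constexpr option_flags cor_matrix_flag{ std::uint64_t{ 1 } << 1 };
constexpr option_flags means_flag{ std::uint64_t{ 1 } << 2 };
```

With this representation, requesting `cor_matrix_flag | means_flag` leaves the covariance bit unset, so a consumer can skip that result.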

Method tags#

struct dense#

Tag-type that denotes the dense computational method.

using by_default = dense#

Alias tag-type for the dense computational method.

Task tags#

struct compute#

Tag-type that parameterizes entities that are used to compute statistics.

using by_default = compute#

Alias tag-type for the compute task.

Computing compute(...)#

Input#

class compute_input#
Template Parameters

Task – Tag-type that specifies the type of the problem to solve. Can be task::compute.

Constructors

compute_input()#
compute_input(const table &data)#

Creates a new instance of the class with the given data property value.

Properties

const table &data#

An $$n \times p$$ table with the input data, where each row stores one feature vector. Default value: table{}.

Getter & Setter
const table & get_data() const
auto & set_data(const table &value)

Result and Finalize Result#

class compute_result#
Template Parameters

Task – Tag-type that specifies the type of the problem to solve. Can be task::compute.

Constructors

compute_result()#

Creates a new instance of the class with the default property values.

Properties

const table &cov_matrix#

The covariance matrix. Default value: table{}.

Getter & Setter
const table & get_cov_matrix() const
auto & set_cov_matrix(const table &value)
const table &means#

The means of the features. Default value: table{}.

Getter & Setter
const table & get_means() const
auto & set_means(const table &value)
const table &cor_matrix#

The correlation matrix. Default value: table{}.

Getter & Setter
const table & get_cor_matrix() const
auto & set_cor_matrix(const table &value)
const result_option_id &result_options#

Result options that indicate which of the result properties are available. Default value: default_result_options<Task>.

Getter & Setter
const result_option_id & get_result_options() const
auto & set_result_options(const result_option_id &value)

Operation#

template<typename Descriptor>
covariance::compute_result compute(const Descriptor &desc, const covariance::compute_input &input)#
Parameters
• desc – Covariance algorithm descriptor covariance::descriptor

• input – Input data for the computing operation

Preconditions
input.data.is_empty == false

Partial Computing#

Partial Input#

class partial_compute_input#

Constructors

partial_compute_input()#
partial_compute_input(const table &data)#
partial_compute_input(const partial_compute_result<Task> &prev, const table &data)#

Properties

const table &data#

Getter & Setter
const table & get_data() const
auto & set_data(const table &value)
const partial_compute_result<Task> &prev#

Getter & Setter
const partial_compute_result< Task > & get_prev() const
auto & set_prev(const partial_compute_result< Task > &value)

Partial Result and Finalize Input#

class partial_compute_result#

Constructors

partial_compute_result()#

Properties

const table &partial_sum#

The per-feature sums. Default value: table{}.

Getter & Setter
const table & get_partial_sum() const
auto & set_partial_sum(const table &value)
const table &partial_crossproduct#

The crossproduct matrix. Default value: table{}.

Getter & Setter
const table & get_partial_crossproduct() const
auto & set_partial_crossproduct(const table &value)
const table &partial_n_rows#

The number of observations (rows). Default value: table{}.

Getter & Setter
const table & get_partial_n_rows() const
auto & set_partial_n_rows(const table &value)

Usage Example#

Computing#

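#include <iostream>

#include "oneapi/dal/algo/covariance.hpp"

namespace dal = oneapi::dal;

// Note: printing a table with operator<< assumes a stream operator for
// dal::table is available (for example, from example utility code).
void run_computing(const dal::table& data) {
    // Request only the correlation matrix and the means.
    const auto cov_desc = dal::covariance::descriptor{}.set_result_options(
        dal::covariance::result_options::cor_matrix | dal::covariance::result_options::means);

    const auto result = dal::compute(cov_desc, data);

    std::cout << "Means:\n" << result.get_means() << std::endl;
    std::cout << "Correlation:\n" << result.get_cor_matrix() << std::endl;
}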