.. ******************************************************************************
.. * Copyright 2020-2021 Intel Corporation
.. *
.. * Licensed under the Apache License, Version 2.0 (the "License");
.. * you may not use this file except in compliance with the License.
.. * You may obtain a copy of the License at
.. *
.. * http://www.apache.org/licenses/LICENSE-2.0
.. *
.. * Unless required by applicable law or agreed to in writing, software
.. * distributed under the License is distributed on an "AS IS" BASIS,
.. * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
.. * See the License for the specific language governing permissions and
.. * limitations under the License.
.. *******************************************************************************/
.. _cor_cov_distributed:
Distributed Processing
======================
This mode assumes that the data set is split into ``nblocks`` blocks across computation nodes.
Algorithm Parameters
********************
The correlation and variance-covariance matrices algorithm in the distributed processing mode has the following parameters:
.. tabularcolumns:: |\Y{0.15}|\Y{0.15}|\Y{0.7}|
.. list-table:: Algorithm Parameters for Correlation and Variance-Covariance Matrices Algorithm (Distributed Processing)
:widths: 10 10 60
:header-rows: 1
:class: longtable
* - Parameter
- Default Valude
- Description
* - ``computeStep``
- Not applicable
- The parameter required to initialize the algorithm. Can be:
- ``step1Local`` - the first step, performed on local nodes
- ``step2Master`` - the second step, performed on a master node
* - ``algorithmFPType``
- ``float``
- The floating-point type that the algorithm uses for intermediate computations. Can be ``float`` or ``double``.
* - ``method``
- ``defaultDense``
- Available methods for computation of low order moments:
defaultDense
default performance-oriented method
singlePassDense
implementation of the single-pass algorithm proposed by D.H.D. West
sumDense
implementation of the algorithm in the cases where the basic statistics associated with
the numeric table are pre-computed sums; returns an error if pre-computed sums are not defined
fastCSR
performance-oriented method for CSR numeric tables
singlePassCSR
implementation of the single-pass algorithm proposed by D.H.D. West; optimized for CSR numeric tables
sumCSR
implementation of the algorithm in the cases where the basic statistics associated with
the numeric table are pre-computed sums; optimized for CSR numeric tables;
returns an error if pre-computed sums are not defined
* - ``outputMatrixType``
- ``covarianceMatrix``
- The type of the output matrix. Can be:
- ``covarianceMatrix`` - variance-covariance matrix
- ``correlationMatrix`` - correlation matrix
Computation of correlation and variance-covariance matrices follows the general schema described in :ref:`algorithms`:
.. _cor_cov_step_1:
Step 1 - on Local Nodes
***********************
In this step, the correlation and variance-covariance matrices algorithm accepts the input described below.
Pass the ``Input ID`` as a parameter to the methods that provide input for your algorithm.
For more details, see :ref:`algorithms`.
.. tabularcolumns:: |\Y{0.2}|\Y{0.8}|
.. list-table:: Step 1: Algorithm Input for Correlation and Variance-Covariance Matrices Algorithm (Distributed Processing)
:widths: 10 60
:header-rows: 1
* - Input ID
- Input
* - ``data``
- Pointer to the numeric table of size :math:`n_i \times p` that represents the :math:`i`-th data block on the local node.
While the input for ``defaultDense``, ``singlePassDense``, or ``sumDense`` method can be an object of any class derived
from ``NumericTable``, the input for ``fastCSR``, ``singlePassCSR``, or ``sumCSR`` method can only be an object of
the ``CSRNumericTable`` class.
In this step, the correlation and variance-covariance matrices algorithm calculates the results described below.
Pass the ``Result ID`` as a parameter to the methods that access the results of your algorithm.
For more details, see :ref:`algorithms`.
.. tabularcolumns:: |\Y{0.2}|\Y{0.8}|
.. list-table:: Step 1: Algorithm Output for Correlation and Variance-Covariance Matrices Algorithm (Distributed Processing)
:widths: 10 60
:header-rows: 1
:class: longtable
* - Result ID
- Result
* - ``nObservations``
- Pointer to the :math:`1 \times 1` numeric table that contains the number of observations processed so far on the local node.
.. note::
By default, this result is an object of the ``HomogenNumericTable`` class,
but you can define the result as an object of any class derived from ``NumericTable``
except ``CSRNumericTable``.
* - ``crossProduct``
- Pointer to :math:`p \times p` numeric table with the cross-product matrix computed so far on the local node.
.. note::
By default, this table is an object of the ``HomogenNumericTable`` class,
but you can define the result as an object of any class derived from ``NumericTable``
except ``PackedSymmetricMatrix``, ``PackedTriangularMatrix``, and ``CSRNumericTable``.
* - ``sum``
- Pointer to :math:`1 \times p` numeric table with partial sums computed so far on the local node.
.. note::
By default, this table is an object of the ``HomogenNumericTable`` class,
but you can define the result as an object of any class derived from ``NumericTable``
except ``PackedSymmetricMatrix``, ``PackedTriangularMatrix``, and ``CSRNumericTable``.
.. _cor_cov_step_2:
Step 2 - on Master Node
***********************
In this step, the correlation and variance-covariance matrices algorithm accepts the input described below.
Pass the ``Input ID`` as a parameter to the methods that provide input for your algorithm.
For more details, see :ref:`algorithms`.
.. tabularcolumns:: |\Y{0.2}|\Y{0.8}|
.. list-table:: Step 2: Algorithm Input for Correlation and Variance-Covariance Matrices Algorithm (Distributed Processing)
:widths: 10 60
:header-rows: 1
* - Input ID
- Input
* - ``partialResults``
- A collection that contains results computed in :ref:`Step 1 ` on local nodes (``nObservations``, ``crossProduct``, and ``sum``).
.. note::
The collection can contain objects of any class derived from the ``NumericTable`` class
except ``PackedSymmetricMatrix`` and ``PackedTriangularMatrix``.
In this step, the correlation and variance-covariance matrices algorithm calculates the results described in the following table.
Pass the ``Result ID`` as a parameter to the methods that access the results of your algorithm.
For more details, see :ref:`algorithms`.
.. tabularcolumns:: |\Y{0.2}|\Y{0.8}|
.. list-table:: Step 2: Algorithm Output for for Correlation and Variance-Covariance Matrices Algorithm (Distributed Processing)
:widths: 10 60
:header-rows: 1
:class: longtable
* - Result ID
- Result
* - ``covariance``
- Use when ``outputMatrixType``=``covarianceMatrix``. Pointer to the numeric table with the :math:`p \times p` variance-covariance matrix.
.. note::
By default, this result is an object of the ``HomogenNumericTable`` class,
but you can define the result as an object of any class derived from ``NumericTable``
except ``PackedTriangularMatrix`` and ``CSRNumericTable``.
* - ``correlation``
- Use when ``outputMatrixType``=``correlationMatrix``. Pointer to the numeric table with the :math:`p \times p` correlation matrix.
.. note::
By default, this result is an object of the ``HomogenNumericTable`` class,
but you can define the result as an object of any class derived from ``NumericTable``
except ``PackedTriangularMatrix`` and ``CSRNumericTable``.
* - ``mean``
- Pointer to the :math:`1 \times p` numeric table with means.
.. note::
By default, this result is an object of the ``HomogenNumericTable`` class,
but you can define the result as an object of any class derived from ``NumericTable``
except ``PackedTriangularMatrix``, ``PackedSymmetricMatrix``, and ``CSRNumericTable``.
.. include:: ../../../opt-notice.rst