.. ****************************************************************************** .. * Copyright 2019-2022 Intel Corporation .. * .. * Licensed under the Apache License, Version 2.0 (the "License"); .. * you may not use this file except in compliance with the License. .. * You may obtain a copy of the License at .. * .. * http://www.apache.org/licenses/LICENSE-2.0 .. * .. * Unless required by applicable law or agreed to in writing, software .. * distributed under the License is distributed on an "AS IS" BASIS, .. * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. .. * See the License for the specific language governing permissions and .. * limitations under the License. .. *******************************************************************************/ Implicit Alternating Least Squares ================================== .. toctree:: :glob: :maxdepth: 4 The library provides the Implicit Alternating Least Squares (implicit ALS) algorithm [Fleischer2008]_, based on collaborative filtering. Details ******* Given the input dataset :math:R=\left\{{r}_{ui}\right\} of size :math:m \times n, where m is the number of users and n is the number of items, the problem is to train the Alternating Least Squares (ALS) model represented as two matrices: :math:X of size :math:m \times f, and :math:Y of size :math:f \times n, where :math:f is the number of factors. The matrices :math:X and :math:Y are the factors of low-rank factorization of matrix :math:R: .. math:: R\approx X\cdot Y Initialization Stage -------------------- Initialization of the matrix Y can be done using the following method: for each :math:i = 1, \ldots, n :math:{y}_{1i}=\frac{1}{m}\sum _{u=1}^{m}{r}_{ui} and :math:y_{ki} are independent random numbers uniformly distributed on the interval :math:(0,1), :math:k = 2, \ldots, f. Training Stage -------------- The ALS model is trained using the implicit ALS algorithm [Hu2008]_ by minimizing the following cost function: .. math:: \underset{{x}_{*},{y}_{*}}{\mathrm{min}}\underset{u,i}{\mathrm{\Sigma }}{c}_{ui}{\left({p}_{ui}-{x}_{u}^{T}{y}_{i}\right)}^{2}+\lambda \left(\underset{u}{\mathrm{\Sigma }}{n}_{{x}_{u}}{\|{x}_{u}\|}^{2}+\underset{i}{\mathrm{\Sigma }}{m}_{{y}_{i}}{\|{y}_{i}\|}^{2}\right), where: - :math:{p}_{ui} indicates the preference of user u of item i: .. math:: p_{ui} = \begin{cases} 1, & {r}_{ui}> \epsilon \\ 0, & {r}_{ui}\le \epsilon \end{cases} - :math:\epsilon is the threshold used to define the preference values. :math:\epsilon = 0 is the only threshold valu supported so far. - :math:{c}_{ui}=1+\alpha r_{ui}, :math:c_{ui} measures the confidence in observing :math:p_{ui} - :math:\alpha is the rate of confidence - :math:r_{ui} is the element of the matrix :math:R - :math:\lambda is the parameter of the regularization - :math:{n}_{{x}_{u}}, :math:{m}_{{y}_{i}} denote the number of ratings of user :math:u and item :math:i respectively Prediction Stage ---------------- .. rubric:: Prediction of Ratings Given the trained ALS model and the matrix :math:D that describes for which pairs of factors :math:X and :math:Y the rating should be computed, the system calculates the matrix of recommended ratings Res: :math:{res}_{ui}=\sum _{j=1}^{f}{x}_{uj}{y}_{ji}, if :math:{d}_{ui}\ne 0, :math:u=1,\ldots,m; :math:i=1,\ldots n. Initialization ************** For initialization, the following computation modes are available: .. toctree:: :maxdepth: 1 initialization-batch.rst initialization-distributed.rst Computation *********** The following computation modes are available: - :ref:implicit_als_batch_computation - Distributed processing for :ref:training  and :ref:prediction of ratings  .. toctree:: :maxdepth: 1 :hidden: computation-batch.rst computation-distributed-training.rst computation-distributed-prediction.rst Examples ******** .. tabs:: .. tab:: C++ (CPU) Batch Processing: - :cpp_example:impl_als_dense_batch.cpp  - :cpp_example:impl_als_csr_batch.cpp  Distributed Processing: - :cpp_example:impl_als_csr_distr.cpp  .. tab:: Java* .. note:: There is no support for Java on GPU. Batch Processing: - :java_example:ImplAlsDenseBatch.java  - :java_example:ImplAlsCSRBatch.java  Distributed Processing: - :java_example:ImplAlsCSRDistr.java  .. tab:: Python* Batch Processing: - :daal4py_example:implicit_als_batch.py Performance Considerations ************************** To get the best overall performance of the implicit ALS recommender: - If input data is homogeneous, provide the input data and store results in homogeneous numeric tables of the same type as specified in the algorithmFPType class template parameter. - If input data is sparse, use CSR numeric tables. .. include:: ../../../opt-notice.rst