Implicit Alternating Least Squares

The library provides the Implicit Alternating Least Squares (implicit ALS) algorithm [Fleischer2008], based on collaborative filtering.

Details

Given the input dataset \(R=\left\{{r}_{ui}\right\}\) of size \(m \times n\), where m is the number of users and n is the number of items, the problem is to train the Alternating Least Squares (ALS) model represented as two matrices: \(X\) of size \(m \times f\), and \(Y\) of size \(f \times n\), where \(f\) is the number of factors. The matrices \(X\) and \(Y\) are the factors of low-rank factorization of matrix \(R\):

\[R\approx X\cdot Y\]

Initialization Stage

Initialization of the matrix Y can be done using the following method: for each \(i = 1, \ldots, n\) \({y}_{1i}=\frac{1}{m}\sum _{u=1}^{m}{r}_{ui}\) and \(y_{ki}\) are independent random numbers uniformly distributed on the interval \((0,1)\), \(k = 2, \ldots, f\).

Training Stage

The ALS model is trained using the implicit ALS algorithm [Hu2008] by minimizing the following cost function:

\[\underset{{x}_{*},{y}_{*}}{\mathrm{min}}\underset{u,i}{\mathrm{\Sigma }}{c}_{ui}{\left({p}_{ui}-{x}_{u}^{T}{y}_{i}\right)}^{2}+\lambda \left(\underset{u}{\mathrm{\Sigma }}{n}_{{x}_{u}}{\|{x}_{u}\|}^{2}+\underset{i}{\mathrm{\Sigma }}{m}_{{y}_{i}}{\|{y}_{i}\|}^{2}\right),\]

where:

  • \({p}_{ui}\) indicates the preference of user u of item i:

    \[\begin{split}p_{ui} = \begin{cases} 1, & {r}_{ui}> \epsilon \\ 0, & {r}_{ui}\le \epsilon \end{cases}\end{split}\]
  • \(\epsilon\) is the threshold used to define the preference values. \(\epsilon = 0\) is the only threshold valu supported so far.

  • \({c}_{ui}=1+\alpha r_{ui}\), \(c_{ui}\) measures the confidence in observing \(p_{ui}\)

  • \(\alpha\) is the rate of confidence

  • \(r_{ui}\) is the element of the matrix \(R\)

  • \(\lambda\) is the parameter of the regularization

  • \({n}_{{x}_{u}}\), \({m}_{{y}_{i}}\) denote the number of ratings of user \(u\) and item \(i\) respectively

Prediction Stage

Prediction of Ratings

Given the trained ALS model and the matrix \(D\) that describes for which pairs of factors \(X\) and \(Y\) the rating should be computed, the system calculates the matrix of recommended ratings Res: \({res}_{ui}=\sum _{j=1}^{f}{x}_{uj}{y}_{ji}\), if \({d}_{ui}\ne 0\), \(u=1,\ldots,m\); \(i=1,\ldots n\).

Initialization

For initialization, the following computation modes are available:

Computation

The following computation modes are available:

Examples

Batch Processing:

Distributed Processing:

Note

There is no support for Java on GPU.

Batch Processing:

Distributed Processing:

Batch Processing:

Performance Considerations

To get the best overall performance of the implicit ALS recommender:

  • If input data is homogeneous, provide the input data and store results in homogeneous numeric tables of the same type as specified in the algorithmFPType class template parameter.

  • If input data is sparse, use CSR numeric tables.

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex​.

Notice revision #20201201