Logistic Regression#

This chapter describes the Logistic Regression algorithm implemented in oneDAL.

The Logistic Regression algorithm solves the classification problem and predicts class labels and probabilities of objects belonging to each class.

Operation | Computational methods | Programming Interface
Training | dense_batch | train(…), train_input, train_result
Inference | dense_batch | infer(…), infer_input, infer_result

Mathematical Formulation#

Training#

Given \(n\) feature vectors \(X=\{x_1=(x_{11},\ldots,x_{1p}),\ldots, x_n=(x_{n1},\ldots,x_{np})\}\) of size \(p\) and \(n\) responses \(Y=\{y_1,\ldots,y_n\}\) with \(y_i \in \{0,1\}\), the problem is to fit the model weights \(w=\{w_0, \ldots, w_p\}\) that minimize the logistic loss

\(L(X, w, y) = \sum_{i = 1}^{n} \left( -y_i \log(prob_i) - (1 - y_i) \log(1 - prob_i) \right)\),

where

* \(prob_i = \sigma(w_0 + \sum_{j=1}^{p} w_j x_{i, j})\) are the predicted probabilities,
* \(\sigma(x) = \frac{1}{1 + \exp(-x)}\) is the sigmoid function.

Note

The probabilities are constrained to the interval \([\epsilon, 1 - \epsilon]\) to prevent issues when computing the logarithm, where \(\epsilon=10^{-7}\) for the float type and \(\epsilon=10^{-15}\) otherwise.
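The loss and the probability clipping described above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the oneDAL implementation: the names `sigmoid` and `logistic_loss` are hypothetical, and the default `eps` assumes double precision.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), evaluated without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logistic_loss(X, y, w, eps=1e-15):
    """Sum of per-sample logistic losses; w[0] is the intercept w_0.

    Hypothetical helper for illustration, not part of the oneDAL API.
    """
    total = 0.0
    for x_i, y_i in zip(X, y):
        z = w[0] + sum(w_j * x_ij for w_j, x_ij in zip(w[1:], x_i))
        # Clip the probability to [eps, 1 - eps] so both logarithms stay finite.
        prob = min(max(sigmoid(z), eps), 1.0 - eps)
        total += -y_i * math.log(prob) - (1 - y_i) * math.log(1.0 - prob)
    return total
```

With all weights set to zero, every predicted probability is \(0.5\), so the loss equals \(n \log 2\). With a very large intercept, the clipping keeps the loss finite even when the prediction is confidently wrong.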

Training Method: dense_batch#

Since Logistic Loss is a convex function, you can use one of the iterative solvers designed for convex problems for minimization. During training, the data is divided into batches, and the gradients from each batch are summed up.

Refer to Mathematical formulation: Newton-CG.
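The batch-wise accumulation of gradients can be illustrated with a small plain-Python sketch. It assumes the standard logistic-loss gradient \(\sum_i (prob_i - y_i)\,(1, x_{i1}, \ldots, x_{ip})\) and ignores probability clipping. The function names and the `batch_size` parameter are hypothetical; the source does not say how oneDAL chooses its batches.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), evaluated without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def batch_gradient(X, y, w):
    """Gradient of the logistic loss over one batch; index 0 is the intercept."""
    grad = [0.0] * len(w)
    for x_i, y_i in zip(X, y):
        z = w[0] + sum(w_j * x_ij for w_j, x_ij in zip(w[1:], x_i))
        residual = sigmoid(z) - y_i
        grad[0] += residual
        for j, x_ij in enumerate(x_i, start=1):
            grad[j] += residual * x_ij
    return grad


def accumulated_gradient(X, y, w, batch_size):
    """Split the data into consecutive batches and sum the batch gradients."""
    total = [0.0] * len(w)
    for start in range(0, len(X), batch_size):
        stop = start + batch_size
        g = batch_gradient(X[start:stop], y[start:stop], w)
        total = [t + g_j for t, g_j in zip(total, g)]
    return total
```

Because the loss is a sum over samples, the accumulated gradient matches the gradient computed over the full data set up to floating-point rounding, regardless of the batch size.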

Inference#

Given \(r\) feature vectors \(X=\{x_1=(x_{11},\ldots,x_{1p}),\ldots, x_r=(x_{r1},\ldots,x_{rp})\}\) of size \(p\), the problem is to calculate the probability that each feature vector belongs to each class and to determine the most probable class label for each object.

The probabilities are calculated as \(prob_i = \sigma(w_0 + \sum_{j=1}^{p} w_j x_{i, j})\), where \(\sigma(x) = \frac{1}{1 + \exp(-x)}\) is the sigmoid function. If the probability is greater than \(0.5\), the class label is set to \(1\); otherwise, it is set to \(0\).
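The inference rule above can be written as a short plain-Python sketch. The function name `predict_proba_and_labels` is hypothetical and is not the oneDAL `infer(…)` interface. The weights are assumed to be stored as a list with the intercept \(w_0\) first.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), evaluated without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba_and_labels(X, w):
    """Return (probabilities of class 1, predicted labels) for each row of X."""
    probs, labels = [], []
    for x_i in X:
        z = w[0] + sum(w_j * x_ij for w_j, x_ij in zip(w[1:], x_i))
        p = sigmoid(z)
        probs.append(p)
        # Label 1 only when the probability is strictly greater than 0.5.
        labels.append(1 if p > 0.5 else 0)
    return probs, labels
```

Note that a probability of exactly \(0.5\) maps to class \(0\), because the comparison is strict.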

Programming Interface#

Refer to API Reference: Logistic Regression.

Examples: Logistic Regression

oneAPI DPC++#

Batch Processing: