Logistic Regression¶
This chapter describes the Logistic Regression algorithm implemented in oneDAL.
The Logistic Regression algorithm solves the classification problem and predicts class labels and probabilities of objects belonging to each class.
Operation | Computational methods | Programming Interface
Mathematical Formulation¶
Training¶
Given \(n\) feature vectors \(X=\{x_1=(x_{11},\ldots,x_{1p}),\ldots, x_n=(x_{n1},\ldots,x_{np})\}\) of size \(p\) and \(n\) responses \(Y=\{y_1,\ldots,y_n\}\), \(y_i \in \{0,1\}\), the problem is to fit the model weights \(w=\{w_0, \ldots, w_p\}\) that minimize the Logistic Loss \(L(X, w, y) = \sum_{i = 1}^{n} -y_i \log(prob_i) - (1 - y_i) \log(1 - prob_i)\), where:
- \(prob_i = \sigma(w_0 + \sum_{j=1}^{p} w_j x_{i, j})\) are the predicted probabilities,
- \(\sigma(x) = \frac{1}{1 + \exp(-x)}\) is the sigmoid function.
Note
The probabilities are constrained to the interval \([\epsilon, 1 - \epsilon]\) to prevent issues when computing the logarithm. Here, \(\epsilon=10^{-7}\) for the float type and \(10^{-15}\) otherwise.
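The loss and the clipping rule can be illustrated with a minimal pure-Python sketch. It is not the oneDAL implementation: the function names `sigmoid` and `logistic_loss`, the list-based data layout, and the sample data are assumptions made for illustration only.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def logistic_loss(X, y, w, eps=1e-15):
    """Sum of per-sample logistic losses; w[0] is the intercept w_0.

    eps is the clipping bound from the note above
    (1e-15 here; 1e-7 would correspond to the float case).
    """
    total = 0.0
    for x_i, y_i in zip(X, y):
        z = w[0] + sum(w_j * x_ij for w_j, x_ij in zip(w[1:], x_i))
        # Clip the probability to [eps, 1 - eps] so that log() stays finite.
        prob = min(max(sigmoid(z), eps), 1.0 - eps)
        total += -y_i * math.log(prob) - (1 - y_i) * math.log(1.0 - prob)
    return total

# Hypothetical data: three samples with two features each.
X = [[0.5, 1.0], [-1.5, 2.0], [3.0, -0.5]]
y = [1, 0, 1]
loss_at_zero = logistic_loss(X, y, [0.0, 0.0, 0.0])
```

With all weights set to zero, every predicted probability is \(0.5\), so the loss equals \(n \log 2\). Without clipping, a confidently wrong prediction would lead to \(\log(0)\).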
Training Method: dense_batch¶
Since Logistic Loss is a convex function, you can use one of the iterative solvers designed for convex problems for minimization. During training, the data is divided into batches, and the gradients from each batch are summed up.
Refer to Mathematical formulation: Newton-CG.
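The batch-wise accumulation of gradients described above can be sketched as follows. This is an illustrative pure-Python example, not the library's code: the helper names `batch_gradient` and `summed_gradient`, the `batch_size` parameter, and the sample data are assumptions. It relies on the standard gradient of the Logistic Loss, \(\partial L / \partial w_j = \sum_{i} (prob_i - y_i) x_{i,j}\) with \(x_{i,0} = 1\), and omits the solver itself.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def batch_gradient(X_batch, y_batch, w):
    """Gradient of the logistic loss over one batch; w[0] is the intercept."""
    grad = [0.0] * len(w)
    for x_i, y_i in zip(X_batch, y_batch):
        z = w[0] + sum(w_j * x_ij for w_j, x_ij in zip(w[1:], x_i))
        residual = sigmoid(z) - y_i
        grad[0] += residual                 # intercept term (x_{i,0} = 1)
        for j, x_ij in enumerate(x_i, start=1):
            grad[j] += residual * x_ij
    return grad

def summed_gradient(X, y, w, batch_size):
    """Split the data into batches and sum the per-batch gradients."""
    total = [0.0] * len(w)
    for start in range(0, len(X), batch_size):
        part = batch_gradient(X[start:start + batch_size],
                              y[start:start + batch_size], w)
        total = [t + p for t, p in zip(total, part)]
    return total

# Hypothetical data: four samples with two features each.
X = [[0.5, 1.0], [-1.5, 2.0], [3.0, -0.5], [0.0, 1.5]]
y = [1, 0, 1, 0]
w = [0.1, -0.2, 0.3]
full = batch_gradient(X, y, w)
batched = summed_gradient(X, y, w, batch_size=2)
```

Because the loss is a sum over samples, the summed per-batch gradients match the gradient computed over the whole data set, up to floating-point rounding.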
Training Method: sparse¶
With this method, you can train a Logistic Regression model on sparse data. To do so, provide the matrix of feature vectors as a sparse table. For more information about sparse tables, see Compressed Sparse Rows (CSR) Table.
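To illustrate why a sparse layout helps, the sketch below stores a feature matrix in CSR form as three arrays (non-zero values, column indices, and row offsets) and computes the linear term \(w_0 + \sum_{j} w_j x_{i,j}\) by visiting only the non-zero entries. It is plain Python, not the oneDAL sparse table API. The helper names and the zero-based indexing are assumptions made for this example.

```python
def dense_to_csr(rows):
    """Convert a dense row-major matrix to CSR arrays (zero-based indices)."""
    values, column_indices, row_offsets = [], [], [0]
    for row in rows:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                column_indices.append(j)
        # Offset of the first non-zero entry of the next row.
        row_offsets.append(len(values))
    return values, column_indices, row_offsets

def csr_linear_scores(values, column_indices, row_offsets, w):
    """Compute w_0 + sum_j w_j * x_ij per row, touching only non-zeros."""
    scores = []
    for i in range(len(row_offsets) - 1):
        z = w[0]
        for k in range(row_offsets[i], row_offsets[i + 1]):
            # Feature column j pairs with weight w[j + 1]; w[0] is the intercept.
            z += w[column_indices[k] + 1] * values[k]
        scores.append(z)
    return scores

# Hypothetical 3 x 3 matrix with mostly zero entries.
dense = [[0.0, 2.0, 0.0],
         [1.0, 0.0, 3.0],
         [0.0, 0.0, 0.0]]
values, column_indices, row_offsets = dense_to_csr(dense)
scores = csr_linear_scores(values, column_indices, row_offsets,
                           [0.5, 1.0, -1.0, 2.0])
```

The cost of each row is proportional to its number of non-zero entries rather than to \(p\), which is the main benefit of the sparse representation.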
Inference¶
Given \(r\) feature vectors \(X=\{x_1=(x_{11},\ldots,x_{1p}),\ldots, x_r=(x_{r1},\ldots,x_{rp})\}\) of size \(p\), the problem is to calculate the probabilities of these feature vectors belonging to each class and to determine the most probable class label for each object.
The probabilities are calculated using the formula \(prob_i = \sigma(w_0 + \sum_{j=1}^{p} w_j x_{i, j})\), where \(\sigma(x) = \frac{1}{1 + \exp(-x)}\) is the sigmoid function. If the probability is greater than \(0.5\), the class label is set to \(1\); otherwise, it is set to \(0\).
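The inference rule can be sketched in a few lines of pure Python. This is an illustration under stated assumptions, not the oneDAL `infer` interface: the function name `predict`, the weight values, and the sample data are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(X, w):
    """Return (probabilities, labels); w[0] is the intercept w_0."""
    probabilities, labels = [], []
    for x_i in X:
        z = w[0] + sum(w_j * x_ij for w_j, x_ij in zip(w[1:], x_i))
        prob = sigmoid(z)
        probabilities.append(prob)
        # Label 1 only when the probability is strictly greater than 0.5.
        labels.append(1 if prob > 0.5 else 0)
    return probabilities, labels

# Hypothetical weights and three feature vectors of size p = 2.
w = [0.0, 1.0, -1.0]
X = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
probabilities, labels = predict(X, w)
```

The third vector gives a linear term of exactly \(0\), so its probability is \(0.5\) and, under the strict comparison, its label is \(0\).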
Programming Interface¶
Refer to API Reference: Logistic Regression.
Examples: Logistic Regression
oneAPI DPC++¶
- Batch Processing: