# Logistic Regression

This chapter describes the Logistic Regression algorithm implemented in oneDAL.

The Logistic Regression algorithm solves the classification problem and predicts class labels and probabilities of objects belonging to each class.

| Operation | Computational methods | Programming Interface |
|---|---|---|
| Training | dense_batch | train(…), train_input, train_result |
| Inference | dense_batch | infer(…), infer_input, infer_result |

## Mathematical Formulation

### Training

Given $$n$$ feature vectors $$X=\{x_1=(x_{11},\ldots,x_{1p}),\ldots, x_n=(x_{n1},\ldots,x_{np})\}$$ of size $$p$$ and $$n$$ responses $$Y=\{y_1,\ldots,y_n\}$$, $$y_i \in \{0,1\}$$, the problem is to fit the model weights $$w=\{w_0, \ldots, w_p\}$$ that minimize the Logistic Loss

$$L(X, w, y) = \sum_{i = 1}^{n} \left( -y_i \log(prob_i) - (1 - y_i) \log(1 - prob_i) \right)$$,

where:

* $$prob_i = \sigma(w_0 + \sum_{j=1}^{p} w_j x_{i, j})$$ are the predicted probabilities,
* $$\sigma(x) = \frac{1}{1 + \exp(-x)}$$ is the sigmoid function.

Note

The probabilities are constrained to the interval $$[\epsilon, 1 - \epsilon]$$ to prevent issues when computing the logarithm, where $$\epsilon=10^{-7}$$ for the float type and $$10^{-15}$$ otherwise.
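As a rough illustration of the loss and of the clamping described in the note, the following pure-Python sketch evaluates the logistic loss on a small dataset. It uses the standard cross-entropy form, in which the second term is $$\log(1 - prob_i)$$. The function names (`sigmoid`, `predict_proba`, `logistic_loss`) and the default `eps` are assumptions made for this example and are not part of the oneDAL API.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow in exp."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x, eps=1e-15):
    """prob = sigma(w_0 + sum_j w_j * x_j), clamped to [eps, 1 - eps]."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return min(max(sigmoid(z), eps), 1.0 - eps)

def logistic_loss(w, X, y, eps=1e-15):
    """Sum over samples of -y*log(prob) - (1 - y)*log(1 - prob)."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict_proba(w, xi, eps)
        total += -yi * math.log(p) - (1 - yi) * math.log(1.0 - p)
    return total

# Illustrative data: 4 samples with 2 features each (hypothetical values).
X = [[0.5, 1.0], [1.5, -0.5], [2.0, 0.0], [-1.0, 2.0]]
y = [0, 1, 1, 0]

# With all-zero weights every probability is 0.5, so the loss is n * log(2).
loss_at_zero = logistic_loss([0.0, 0.0, 0.0], X, y)

# Without clamping, this confidently wrong prediction would require log(0).
loss_extreme = logistic_loss([1000.0, 0.0], [[0.0]], [0])
```

Without the clamp, `loss_extreme` would fail with a math domain error, because the sigmoid rounds to exactly 1 in double precision.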

#### Training Method: dense_batch

Since Logistic Loss is a convex function, you can minimize it with any iterative solver designed for convex problems. During training, the data is divided into batches, and the gradients from each batch are summed up.
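The batch-wise accumulation can be sketched as follows. Because the loss is a sum over samples, the gradients computed on disjoint batches add up to the gradient over the full dataset. The gradient expression used here, $$\sum_i (prob_i - y_i)(1, x_i)$$, is the standard gradient of the logistic loss and is not taken from the oneDAL sources; the function names and the batch size are likewise assumptions for illustration.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow in exp."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def batch_gradient(w, X_batch, y_batch):
    """Gradient of the summed logistic loss over one batch."""
    grad = [0.0] * len(w)
    for xi, yi in zip(X_batch, y_batch):
        z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
        residual = sigmoid(z) - yi      # prob_i - y_i
        grad[0] += residual             # derivative with respect to the intercept w_0
        for j, xj in enumerate(xi, start=1):
            grad[j] += residual * xj    # derivative with respect to w_j
    return grad

def accumulated_gradient(w, X, y, batch_size):
    """Split the data into batches and sum the per-batch gradients."""
    total = [0.0] * len(w)
    for start in range(0, len(X), batch_size):
        g = batch_gradient(w, X[start:start + batch_size], y[start:start + batch_size])
        total = [t + gj for t, gj in zip(total, g)]
    return total

# Illustrative data (hypothetical values).
X = [[0.5, 1.0], [1.5, -0.5], [2.0, 0.0], [-1.0, 2.0]]
y = [0, 1, 1, 0]
w = [0.1, -0.2, 0.3]

grad_batched = accumulated_gradient(w, X, y, batch_size=2)
grad_full = accumulated_gradient(w, X, y, batch_size=len(X))
```

Up to floating-point rounding, `grad_batched` and `grad_full` agree, which is what allows a solver to process the data batch by batch.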

Refer to Mathematical formulation: Newton-CG.

#### Training Method: sparse

Using this method, you can train a Logistic Regression model on sparse data. To do so, provide the matrix of feature vectors as a sparse table. For more information about sparse tables, see Compressed Sparse Rows (CSR) Table.
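To make the sparse layout concrete, here is a minimal pure-Python sketch of the generic CSR format (values, column indices, row offsets) and of how the linear term $$w_0 + \sum_j w_j x_{i,j}$$ can be evaluated by visiting only the stored non-zero entries. It illustrates the format, not the oneDAL sparse table itself: the zero-based indexing and the helper names `dense_to_csr` and `csr_linear_scores` are assumptions made for this example.

```python
def dense_to_csr(rows):
    """Convert a dense row-major matrix to CSR arrays (zero-based indices)."""
    data, col_indices, row_offsets = [], [], [0]
    for row in rows:
        for j, value in enumerate(row):
            if value != 0:
                data.append(value)
                col_indices.append(j)
        # row_offsets[i + 1] marks where row i ends in data / col_indices.
        row_offsets.append(len(data))
    return data, col_indices, row_offsets

def csr_linear_scores(data, col_indices, row_offsets, w):
    """Compute w_0 + sum_j w_j * x_ij per row, touching only non-zero entries."""
    scores = []
    for i in range(len(row_offsets) - 1):
        z = w[0]
        for k in range(row_offsets[i], row_offsets[i + 1]):
            z += w[col_indices[k] + 1] * data[k]   # +1 skips the intercept w_0
        scores.append(z)
    return scores

# Illustrative 3 x 3 matrix with two non-zero entries (hypothetical values).
dense = [[0.0, 2.0, 0.0],
         [1.0, 0.0, 0.0],
         [0.0, 0.0, 0.0]]
data, col_indices, row_offsets = dense_to_csr(dense)
scores = csr_linear_scores(data, col_indices, row_offsets, [0.5, 1.0, 2.0, 3.0])
```

The cost of each score is proportional to the number of non-zero entries in the row rather than to $$p$$, which is the main reason to pass sparse data in this form.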

### Inference

Given $$r$$ feature vectors $$X=\{x_1=(x_{11},\ldots,x_{1p}),\ldots, x_r=(x_{r1},\ldots,x_{rp})\}$$ of size $$p$$, the problem is to calculate the probabilities of these feature vectors belonging to each class and to determine the most probable class label for each object.

The probabilities are calculated using the formula $$prob_i = \sigma(w_0 + \sum_{j=1}^{p} w_j x_{i, j})$$, where $$\sigma(x) = \frac{1}{1 + \exp(-x)}$$ is the sigmoid function. If the probability is greater than $$0.5$$, the class label is set to $$1$$; otherwise, it is set to $$0$$.
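The inference rule can be sketched in a few lines of pure Python. The function name `infer` and the sample weights are hypothetical and chosen only for illustration; this is not the oneDAL `infer(…)` interface.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow in exp."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def infer(w, X):
    """Return (probabilities, labels) for each feature vector in X."""
    probs, labels = [], []
    for x in X:
        p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
        probs.append(p)
        # Label 1 only when the probability is strictly greater than 0.5.
        labels.append(1 if p > 0.5 else 0)
    return probs, labels

# Hypothetical weights (w_0 = -1, w_1 = 2) and three one-feature samples.
w = [-1.0, 2.0]
X = [[0.0], [0.5], [2.0]]
probs, labels = infer(w, X)
```

The second sample lies exactly on the decision boundary (its probability is 0.5), so under the strict comparison it receives label 0.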

## Programming Interface

Refer to API Reference: Logistic Regression.

Examples: Logistic Regression

### oneAPI DPC++

Batch Processing: