LogitBoost Classifier¶
LogitBoost is a boosting classification algorithm. LogitBoost and AdaBoost are closely related: both fit an additive logistic regression model. The difference is that AdaBoost minimizes the exponential loss, whereas LogitBoost minimizes the logistic loss.
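For the two-class case with labels \(y \in \{-1, 1\}\) and an additive model \(F(x)\) (a standard formulation from [Friedman00], not spelled out in this section), the two criteria can be written as

\[L_{\text{AdaBoost}}(y, F(x)) = e^{-yF(x)}, \qquad L_{\text{LogitBoost}}(y, F(x)) = \log\left(1 + e^{-2yF(x)}\right).\]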
LogitBoost within oneDAL implements a multi-class classifier.
Details¶
Given \(n\) feature vectors \(x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})\) of size \(p\) and a vector of class labels \(y= (y_1, \ldots, y_n)\), where \(y_i \in K = \{0, \ldots, J-1\}\) describes the class to which the feature vector \(x_i\) belongs and \(J\) is the number of classes, the problem is to build a multi-class LogitBoost classifier.
Training Stage¶
The LogitBoost model is trained using the Friedman method [Friedman00].
Let \(y_{ij} = I\{x_i \in \text{class } j\}\) be the indicator that the \(i\)-th feature vector belongs to class \(j\). The scheme below, which uses the stump weak learner, shows the major steps of the algorithm:

1. Start with weights \(w_{ij} = \frac{1}{n}\), \(F_j(x) = 0\), \(p_j(x) = \frac{1}{J}\), \(i = 1, \ldots, n\), \(j = 0, \ldots, J-1\).

2. For \(m = 1, \ldots, M\):

   1. For \(j = 0, \ldots, J-1\):

      1. Compute working responses and weights in the \(j\)-th class:

         \[w_{ij} = p_j(x_i)(1 - p_j(x_i)), \quad w_{ij} = \max(w_{ij}, \text{Thr1})\]
         \[z_{ij} = \frac{y_{ij} - p_j(x_i)}{w_{ij}}, \quad z_{ij} = \min(\max(z_{ij}, -\text{Thr2}), \text{Thr2})\]

      2. Fit the function \(f_{mj}(x)\) by a weighted least-squares regression of \(z_{ij}\) to \(x_i\) with weights \(w_{ij}\) using the stump-based approach.

   2. Set \(f_{mj}(x) = \frac{J-1}{J} \left( f_{mj}(x) - \frac{1}{J} \sum_{k=0}^{J-1} f_{mk}(x) \right)\) for \(j = 0, \ldots, J-1\).

   3. Update \(F_j(x) = F_j(x) + f_{mj}(x)\).

   4. Update \(p_j(x) = \frac{e^{F_j(x)}}{\sum_{k=0}^{J-1} e^{F_k(x)}}\).
The result of the model training is a set of \(M \cdot J\) stumps, one per class for each of the \(M\) boosting iterations.
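The following is a minimal NumPy sketch of the scheme above, not the oneDAL implementation: the stump fitting is a naive exhaustive weighted least-squares search, the names `fit_stump`, `logitboost_train`, `thr1`, and `thr2` are made up for illustration, and `thr2=4.0` is an illustrative response cap rather than a library default.

```python
import numpy as np

def fit_stump(x, z, w):
    """Weighted least-squares regression stump: one split, two constant leaf values."""
    n, p = x.shape
    best = None  # (sse, feature, threshold, left_value, right_value)
    for d in range(p):
        for t in np.unique(x[:, d])[:-1]:        # candidate split points
            left = x[:, d] <= t
            c_left = np.average(z[left], weights=w[left])
            c_right = np.average(z[~left], weights=w[~left])
            pred = np.where(left, c_left, c_right)
            sse = np.sum(w * (z - pred) ** 2)
            if best is None or sse < best[0]:
                best = (sse, d, t, c_left, c_right)
    _, d, t, a, b = best
    return lambda xx, d=d, t=t, a=a, b=b: np.where(xx[:, d] <= t, a, b)

def logitboost_train(x, y, n_classes, m_iterations, thr1=1e-10, thr2=4.0):
    """Train M rounds of J per-class stumps following the scheme above."""
    n, J = x.shape[0], n_classes
    y_ind = np.eye(J)[y]                         # y_ij = I{x_i belongs to class j}
    F = np.zeros((n, J))                         # F_j(x_i)
    p = np.full((n, J), 1.0 / J)                 # p_j(x_i)
    stumps = []                                  # M lists of J stumps
    for m in range(m_iterations):
        f = np.zeros((n, J))
        round_stumps = []
        for j in range(J):
            w = np.maximum(p[:, j] * (1.0 - p[:, j]), thr1)         # Thr1 clamp on weights
            z = np.clip((y_ind[:, j] - p[:, j]) / w, -thr2, thr2)   # Thr2 clamp on responses
            stump = fit_stump(x, z, w)
            round_stumps.append(stump)
            f[:, j] = stump(x)
        f = (J - 1.0) / J * (f - f.mean(axis=1, keepdims=True))     # symmetrize f_mj
        F += f
        e = np.exp(F - F.max(axis=1, keepdims=True))                # stable softmax for p_j
        p = e / e.sum(axis=1, keepdims=True)
        stumps.append(round_stumps)
    return stumps
```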
Prediction Stage¶
Given the LogitBoost classifier and \(r\) feature vectors \(x_1, \ldots, x_r\), the problem is to calculate the labels \(\underset{j}{\mathrm{argmax}}\, F_j(x_i)\), \(i = 1, \ldots, r\), of the classes to which the feature vectors belong.
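Continuing the hedged NumPy sketch from the training stage (an illustration only, not the oneDAL API), prediction evaluates the stored stumps, applies the same symmetrization as in training, and takes the per-class argmax:

```python
import numpy as np

def logitboost_predict(stumps, x, n_classes):
    """Return argmax_j F_j(x_i) for each row of x, given stumps from logitboost_train."""
    J = n_classes
    F = np.zeros((x.shape[0], J))
    for round_stumps in stumps:
        f = np.column_stack([stump(x) for stump in round_stumps])
        F += (J - 1.0) / J * (f - f.mean(axis=1, keepdims=True))
    return np.argmax(F, axis=1)
```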
Batch Processing¶
The LogitBoost classifier follows the general workflow described in Classification Usage Model.
Training¶
For a description of the input and output, refer to Classification Usage Model.
At the training stage, a LogitBoost classifier has the following parameters:
| Parameter | Default Value | Description |
|---|---|---|
| `algorithmFPType` | `float` | The floating-point type that the algorithm uses for intermediate computations. Can be `float` or `double`. |
| `method` | `defaultDense` | The computation method used by the LogitBoost classifier. The only training method supported so far is the Friedman method. |
| `weakLearnerTraining` | DEPRECATED: Pointer to an object of the stump training class. USE INSTEAD: Pointer to an object of the regression stump training class. | DEPRECATED: Pointer to the training algorithm of the weak learner. By default, a stump weak learner is used. USE INSTEAD: Pointer to the regression training algorithm. By default, a regression stump with the mse split criterion is used. |
| `weakLearnerPrediction` | DEPRECATED: Pointer to an object of the stump prediction class. USE INSTEAD: Pointer to an object of the regression stump prediction class. | DEPRECATED: Pointer to the prediction algorithm of the weak learner. By default, a stump weak learner is used. USE INSTEAD: Pointer to the regression prediction algorithm. By default, a regression stump with the mse split criterion is used. |
| `accuracyThreshold` | \(0.01\) | LogitBoost training accuracy. |
| `maxIterations` | \(100\) | The maximal number of iterations for the LogitBoost algorithm. |
| `nClasses` | Not applicable | The number of classes, a required parameter. |
| `weightsDegenerateCasesThreshold` | \(1\mathrm{e}-10\) | The threshold to avoid degenerate cases when calculating weights \(w_{ij}\) (Thr1 in the training scheme above). |
| `responsesDegenerateCasesThreshold` | \(1\mathrm{e}-10\) | The threshold to avoid degenerate cases when calculating responses \(z_{ij}\) (Thr2 in the training scheme above). |
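As a hedged illustration of how these training parameters might be set from Python through the daal4py bindings (the class and keyword names below, `logitboost_training`, `nClasses`, `maxIterations`, and `accuracyThreshold`, are assumptions based on how daal4py mirrors the C++ parameter names; verify them against your installed version):

```python
import numpy as np
import daal4py as d4p

# toy data: 100 samples, 4 features, 5 classes; labels as a column of floats
rng = np.random.default_rng(0)
data = rng.random((100, 4))
labels = rng.integers(0, 5, size=(100, 1)).astype(np.float64)

# assumed daal4py mirror of the training parameters in the table above
train_algo = d4p.logitboost_training(nClasses=5,
                                     maxIterations=100,
                                     accuracyThreshold=0.01)
train_result = train_algo.compute(data, labels)
model = train_result.model   # LogitBoost model holding the fitted stumps
```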
Prediction¶
For a description of the input and output, refer to Classification Usage Model.
At the prediction stage, a LogitBoost classifier has the following parameters:
| Parameter | Default Value | Description |
|---|---|---|
| `algorithmFPType` | `float` | The floating-point type that the algorithm uses for intermediate computations. Can be `float` or `double`. |
| `method` | `defaultDense` | Performance-oriented computation method, the only method supported by the LogitBoost classifier at the prediction stage. |
| `weakLearnerPrediction` | DEPRECATED: Pointer to an object of the stump prediction class. USE INSTEAD: Pointer to an object of the regression stump prediction class. | DEPRECATED: Pointer to the prediction algorithm of the weak learner. By default, a stump weak learner is used. USE INSTEAD: Pointer to the regression prediction algorithm. By default, a regression stump with the mse split criterion is used. |
| `nClasses` | Not applicable | The number of classes, a required parameter. |
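A matching hedged daal4py prediction sketch (again, `logitboost_prediction` and the result attribute `prediction` are assumptions based on daal4py's usual naming; check them against your installation):

```python
import numpy as np
import daal4py as d4p

# assumes `model` was obtained from logitboost_training as sketched above
test_data = np.random.default_rng(1).random((10, 4))
predict_algo = d4p.logitboost_prediction(nClasses=5)
predict_result = predict_algo.compute(test_data, model)
predicted_labels = predict_result.prediction   # shape (10, 1): argmax_j F_j(x)
```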
Note
The algorithm terminates if it achieves the specified accuracy or reaches the specified maximal number of iterations. To determine the actual number of iterations performed, call the `getNumberOfWeakLearners()` method of the `LogitBoostModel` class and divide it by `nClasses`.
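The relation described in the note can be seen with the NumPy sketch from the Training Stage section (illustration only; the sketch always runs the full number of iterations, since it does not implement the accuracy-based stop):

```python
import numpy as np

# requires logitboost_train from the sketch in the Training Stage section
rng = np.random.default_rng(0)
x = rng.random((60, 3))
y = rng.integers(0, 3, size=60)

stumps = logitboost_train(x, y, n_classes=3, m_iterations=5)
n_weak_learners = sum(len(round_stumps) for round_stumps in stumps)  # 5 * 3 = 15
n_iterations = n_weak_learners // 3                                  # recovers 5
```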
Examples¶
Batch Processing: