Elastic Net#
Elastic Net is a method for modeling the relationship between a dependent variable (which may be a vector) and one or more explanatory variables by fitting a regularized least squares model. The Elastic Net regression model uses a penalty that is the sum of L1 and L2 regularization terms, which combines the advantages of the Ridge Regression and LASSO algorithms. This penalty is particularly useful when there are many correlated predictor variables [Friedman2010].
Details#
Let \((x_1, \ldots, x_p)\) be a vector of input variables and \(y = (y_1, \ldots, y_k)\) be the response. For each \(j = 1, \ldots, k\), the Elastic Net model has the same linear form as the linear and ridge regression models [Hoerl70]:
\[y_j = \beta_{0j} + x_1 \beta_{1j} + \ldots + x_p \beta_{pj}\]
The one exception is that the coefficients are estimated by minimizing a mean squared error (MSE) objective function that is regularized by \(L_1\) and \(L_2\) penalties.
Here \(x_i\), \(i = 1, \ldots, p\), are referred to as independent variables, and \(y_j\), \(j = 1, \ldots, k\), are referred to as dependent variables or responses.
Training Stage#
Let \((x_{11}, \ldots, x_{1p}, y_{11}, \ldots, y_{1k}), \ldots, (x_{n1}, \ldots, x_{np}, y_{n1}, \ldots, y_{nk})\) be a set of training data (for a regression task, \(n \gg p\); for feature selection, \(p\) may be greater than \(n\)). The matrix \(X\) of size \(n \times p\) contains the observations \(x_{ij}\), \(i = 1, \ldots, n\), \(j = 1, \ldots, p\), of the independent variables.
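For each \(y_j\), \(j = 1, \ldots, k\), the Elastic Net regression estimates \((\beta_{0j}, \beta_{1j}, \ldots, \beta_{pj})\) by minimizing the objective function:
\[F_j(\beta) = \frac{1}{2n} \sum_{i=1}^{n} \left( y_{ij} - \beta_{0j} - \sum_{q=1}^{p} \beta_{qj} x_{iq} \right)^2 + \lambda_{1j} \sum_{q=1}^{p} |\beta_{qj}| + \lambda_{2j} \frac{1}{2} \sum_{q=1}^{p} \beta_{qj}^2\]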
In the equation above, the first term is the mean squared error function, and the second and third terms are regularization terms that penalize the \(L_1\) norm and the squared \(L_2\) norm of the coefficient vector \(\beta_j\), where \(\lambda_{1j} \geq 0\), \(\lambda_{2j} \geq 0\), \(j = 1, \ldots, k\).
For more details, see [Hastie2009] and [Friedman2010].
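As a rough illustration of how the three terms of the objective combine, the following sketch evaluates it for a single response column in plain Python. The function name `elastic_net_objective` and its argument layout are hypothetical, and the \(\frac{1}{2n}\) scaling of the squared error and the \(\frac{1}{2}\) factor on the \(L_2\) term are assumed conventions rather than a statement about any particular implementation.

```python
def elastic_net_objective(X, y, beta0, beta, lam1, lam2):
    """Evaluate an Elastic Net objective for one response column.

    X     : list of n rows, each a list of p feature values
    y     : list of n response values
    beta0 : intercept (not penalized)
    beta  : list of p coefficients
    lam1, lam2 : non-negative L1 and L2 penalty weights
    """
    n = len(X)

    # Squared-error term, scaled by 1/(2n) (assumed convention).
    squared_error = 0.0
    for row, target in zip(X, y):
        prediction = beta0 + sum(x * b for x, b in zip(row, beta))
        squared_error += (target - prediction) ** 2
    mse_term = squared_error / (2 * n)

    # L1 penalty: sum of absolute coefficient values.
    l1_term = lam1 * sum(abs(b) for b in beta)

    # L2 penalty: half the sum of squared coefficient values.
    l2_term = lam2 * 0.5 * sum(b * b for b in beta)

    return mse_term + l1_term + l2_term
```

When the coefficients fit the data exactly, the first term vanishes and only the two penalties contribute, which makes the trade-off between fit and coefficient size easy to inspect.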
By default, the Coordinate Descent iterative solver is used to minimize the objective function. The SAGA solver can also be used for the minimization.
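To give a feel for how coordinate descent handles this kind of objective, here is a minimal sketch of cyclic coordinate descent with soft-thresholding for a single response column. It is not the library's solver: the names `soft_threshold` and `fit_elastic_net_cd`, the stopping rule, and the assumed objective scaling (\(\frac{1}{2n}\) on the squared error, \(\frac{1}{2}\) on the \(L_2\) term) are all illustrative choices.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net_cd(X, y, lam1, lam2, n_iter=1000, tol=1e-12):
    """Cyclic coordinate descent for one response column (illustrative).

    Returns (beta0, beta), where beta0 is the unpenalized intercept.
    """
    n, p = len(X), len(X[0])
    beta0, beta = 0.0, [0.0] * p
    # Residuals r_i = y_i - beta0 - sum_q x_iq * beta_q (all zeros initially).
    resid = list(y)

    for _ in range(n_iter):
        # Intercept step: shift by the mean residual.
        shift = sum(resid) / n
        beta0 += shift
        resid = [r - shift for r in resid]
        max_change = abs(shift)

        for q in range(p):
            col = [row[q] for row in X]
            z = sum(x * x for x in col) / n
            if z == 0.0 and lam2 == 0.0:
                continue  # constant-zero column: nothing to update
            # Correlation of column q with the residual that excludes beta_q.
            rho = sum(x * (r + x * beta[q]) for x, r in zip(col, resid)) / n
            new_value = soft_threshold(rho, lam1) / (z + lam2)
            delta = new_value - beta[q]
            if delta != 0.0:
                resid = [r - x * delta for x, r in zip(col, resid)]
                beta[q] = new_value
            max_change = max(max_change, abs(delta))

        if max_change < tol:
            break

    return beta0, beta
```

With both penalty weights set to zero the loop reduces to coordinate-wise least squares, while a sufficiently large `lam1` drives every coefficient to exactly zero and leaves only the intercept.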
Prediction Stage#
Prediction based on Elastic Net regression is done for an input vector \((x_1, \ldots, x_p)\) using the equation \(y_j = \beta_{0j} + x_1 \beta_{1j} + \ldots + x_p \beta_{pj}\) for each \(j = 1, \ldots, k\).
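The prediction equation can be sketched directly in Python. The function name `predict` and the way the coefficients are stored (one list of \(p\) coefficients per response) are assumptions made for this example only.

```python
def predict(x, intercepts, coefs):
    """Predict all k responses for one input vector.

    x          : list of p feature values (x_1, ..., x_p)
    intercepts : list of k intercepts, intercepts[j] = beta_0j
    coefs      : list of k lists, coefs[j] = (beta_1j, ..., beta_pj)
    """
    return [
        b0 + sum(xi * bq for xi, bq in zip(x, beta_j))
        for b0, beta_j in zip(intercepts, coefs)
    ]


# Example with p = 2 features and k = 2 responses.
predictions = predict([1.0, 2.0], [0.5, -1.0], [[2.0, 0.0], [1.0, 1.0]])
print(predictions)  # [2.5, 2.0]
```

Each response is an independent linear combination of the same input vector, so the \(k\) predictions can be computed separately or, equivalently, as a single matrix-vector product.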