.. ******************************************************************************
.. * Copyright 2019 Intel Corporation
.. *
.. * Licensed under the Apache License, Version 2.0 (the "License");
.. * you may not use this file except in compliance with the License.
.. * You may obtain a copy of the License at
.. *
.. * http://www.apache.org/licenses/LICENSE-2.0
.. *
.. * Unless required by applicable law or agreed to in writing, software
.. * distributed under the License is distributed on an "AS IS" BASIS,
.. * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
.. * See the License for the specific language governing permissions and
.. * limitations under the License.
.. *******************************************************************************/
.. re-use for math equations:
.. |x_vector| replace:: :math:`(x_1, \ldots, x_p)`
.. _linear_regression:
Linear Regression
=================
.. toctree::
:glob:
:maxdepth: 4
Linear regression is a method for modeling the relationship between a
dependent variable (which may be a vector) and one or more
explanatory variables by fitting linear equations to observed data.
Details
*******
Let |x_vector| be a vector of input variables and
:math:`y=(y_1, \ldots, y_k)` be the response. For each :math:`j=1, \ldots ,k`,
the linear regression model has the format [Hastie2009]_:
.. math::
y_j = \beta_{0j} + \beta_{1j} x_1 + \ldots + \beta_{pj} x_p
Here :math:`x_i`, :math:`i=1, \ldots,p`, are referred to as independent
variables, and :math:`y_j` are referred to as dependent variables
or responses.
Depending on the number of explanatory variables, linear regression is called:
- **Simple Linear Regression** (if there is only one explanatory variable)
- **Multiple Linear Regression** (if the number of explanatory variables :math:`p > 1`)
Training Stage
--------------
Let :math:`(x_{11}, \ldots, x_{1p}, y_{11}, \ldots, y_{1k}), \ldots, (x_{n1}, \ldots, x_{np}, y_{n1}, \ldots, y_{nk})` be a set of
training data, :math:`n \gg p`. The matrix :math:`X` of size :math:`n \times p` contains
observations :math:`x_{ij}`, :math:`i=1, \ldots, n`, :math:`j = 1, \ldots, p` of independent
variables.
To estimate the coefficients :math:`(\beta_{0j}, \ldots, \beta_{pj})`,
one of the following methods can be used:
- Normal Equation system
- QR matrix decomposition
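As a sketch of what these two estimation methods compute (not the library's implementation), both can be expressed with NumPy; the column of ones appended to :math:`X` accounts for the intercept :math:`\beta_{0j}`:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 3
X = rng.standard_normal((n, p))
true_beta = np.array([2.0, -1.0, 0.5, 3.0])   # beta_0, beta_1, ..., beta_p
y = true_beta[0] + X @ true_beta[1:]          # noise-free synthetic responses

# Augment X with a column of ones so beta_0 is estimated as well
Xa = np.hstack([np.ones((n, 1)), X])

# Method 1: normal equation system  (X^T X) beta = X^T y
beta_normal = np.linalg.solve(Xa.T @ Xa, Xa.T @ y)

# Method 2: QR decomposition  X = QR, then solve  R beta = Q^T y
Q, R = np.linalg.qr(Xa)
beta_qr = np.linalg.solve(R, Q.T @ y)
```

On noise-free data both methods recover the same coefficients; in practice the QR approach is preferred when :math:`X^T X` is ill-conditioned.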
Prediction Stage
----------------
Linear regression-based prediction is done for an input vector |x_vector|
using the equation :math:`y_j = \beta_{0j} + \beta_{1j}x_1 + \ldots + \beta_{pj}x_p`
for each :math:`j=1, \ldots, k`.
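The prediction step above amounts to a single dot product per response; a minimal sketch with NumPy, assuming the coefficients :math:`\beta_0, \ldots, \beta_p` have already been estimated (the values below are illustrative, not produced by the library):

```python
import numpy as np

# Pre-calculated coefficients for one response: beta_0 (intercept), beta_1, beta_2
beta = np.array([1.0, 2.0, -0.5])

def predict(x):
    """Evaluate y = beta_0 + beta_1 * x_1 + ... + beta_p * x_p."""
    return beta[0] + np.dot(beta[1:], x)

result = predict(np.array([3.0, 4.0]))  # 1 + 2*3 - 0.5*4 = 5.0
```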
Usage of Training Alternative
*****************************
To build a Linear Regression model using the methods of the Linear Regression Model Builder class, complete the following steps:
- Create a Linear Regression model builder using a constructor with the required number of responses and features.
- Use the ``setBeta`` method to add the set of pre-calculated coefficients to the model.
Specify random access iterators to the first and the last element of the set of coefficients [ISO/IEC 14882:2011 §24.2.7]_.
.. note::
If your set of coefficients does not contain an intercept,
``interceptFlag`` is automatically set to ``False``; otherwise, it is set to ``True``.
- Use the ``getModel`` method to get the trained Linear Regression model.
- Use the ``getStatus`` method to check the status of the model building process.
If the ``DAAL_NOTHROW_EXCEPTIONS`` macro is defined, the status report contains the list of errors
that describe the problems the API encountered (in case of an API runtime failure).
.. note::
If after calling the ``getModel`` method you use the ``setBeta`` method to update coefficients,
the initial model will be automatically updated with the new :math:`\beta` coefficients.
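The relationship between the number of supplied coefficients and ``interceptFlag`` can be sketched in plain Python; ``ModelBuilderSketch`` below is a hypothetical stand-in for illustration only, not the DAAL Model Builder API:

```python
class ModelBuilderSketch:
    """Hypothetical stand-in for the Model Builder (not the DAAL API itself)."""

    def __init__(self, n_features, n_responses):
        self.n_features = n_features
        self.n_responses = n_responses
        self.beta = None
        self.intercept_flag = None

    def set_beta(self, coefficients):
        # interceptFlag is deduced from how many coefficients are supplied
        # per response: p + 1 means an intercept is present, p means it is not.
        per_response = len(coefficients) // self.n_responses
        self.intercept_flag = (per_response == self.n_features + 1)
        self.beta = list(coefficients)

    def get_model(self):
        return {"beta": self.beta, "interceptFlag": self.intercept_flag}

builder = ModelBuilderSketch(n_features=3, n_responses=1)
builder.set_beta([2.0, -1.0, 0.5, 3.0])   # 4 = p + 1 coefficients: intercept included
model = builder.get_model()
```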
Examples
--------
.. tabs::
.. tab:: C++ (CPU)
- :cpp_example:`lin_reg_model_builder.cpp`
.. tab:: Python*
- :daal4py_example:`lin_reg_model.py`