Home | Trees | Indices | Help |
|
---|
|
Logistic Regression (aka logit, MaxEnt) classifier. This node has been automatically generated by wrapping the ``sklearn.linear_model.logistic.LogisticRegression`` class from the ``sklearn`` library. The wrapped instance can be accessed through the ``scikits_alg`` attribute. In the multiclass case, the training algorithm uses the one-vs-rest (OvR) scheme if the 'multi_class' option is set to 'ovr' and uses the cross-entropy loss, if the 'multi_class' option is set to 'multinomial'. (Currently the 'multinomial' option is supported only by the 'lbfgs' and 'newton-cg' solvers.) This class implements regularized logistic regression using the `liblinear` library, newton-cg and lbfgs solvers. It can handle both dense and sparse input. Use C-ordered arrays or CSR matrices containing 64-bit floats for optimal performance; any other input format will be converted (and copied). The newton-cg and lbfgs solvers support only L2 regularization with primal formulation. The liblinear solver supports both L1 and L2 regularization, with a dual formulation only for the L2 penalty. Read more in the :ref:`User Guide <logistic_regression>`. **Parameters** penalty : str, 'l1' or 'l2' Used to specify the norm used in the penalization. The newton-cg and lbfgs solvers support only l2 penalties. dual : bool Dual or primal formulation. Dual formulation is only implemented for l2 penalty with liblinear solver. Prefer dual=False when n_samples > n_features. C : float, optional (default=1.0) Inverse of regularization strength; must be a positive float. Like in support vector machines, smaller values specify stronger regularization. fit_intercept : bool, default: True Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function. intercept_scaling : float, default: 1 Useful only if solver is liblinear. when self.fit_intercept is True, instance vector x becomes [x, self.intercept_scaling], i.e. a "synthetic" feature with constant value equals to intercept_scaling is appended to the instance vector. The intercept becomes intercept_scaling * synthetic feature weight Note! the synthetic feature weight is subject to l1/l2 regularization as all other features. To lessen the effect of regularization on synthetic feature weight (and therefore on the intercept) intercept_scaling has to be increased. class_weight : dict or 'balanced', optional Weights associated with classes in the form ``{class_label: weight}``. If not given, all classes are supposed to have weight one. The "balanced" mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as ``n_samples / (n_classes * np.bincount(y))`` Note that these weights will be multiplied with sample_weight (passed through the fit method) if sample_weight is specified. .. versionadded:: 0.17 *class_weight='balanced'* instead of deprecated *class_weight='auto'*. max_iter : int Useful only for the newton-cg, sag and lbfgs solvers. Maximum number of iterations taken for the solvers to converge. random_state : int seed, RandomState instance, or None (default) The seed of the pseudo random number generator to use when shuffling the data. solver : {'newton-cg', 'lbfgs', 'liblinear', 'sag'} Algorithm to use in the optimization problem. - For small datasets, 'liblinear' is a good choice, whereas 'sag' is faster for large ones. - For multiclass problems, only 'newton-cg' and 'lbfgs' handle multinomial loss; 'sag' and 'liblinear' are limited to one-versus-rest schemes. - 'newton-cg', 'lbfgs' and 'sag' only handle L2 penalty. Note that 'sag' fast convergence is only guaranteed on features with approximately the same scale. You can preprocess the data with a scaler from sklearn.preprocessing. .. versionadded:: 0.17 Stochastic Average Gradient descent solver. tol : float, optional Tolerance for stopping criteria. multi_class : str, {'ovr', 'multinomial'} Multiclass option can be either 'ovr' or 'multinomial'. If the option chosen is 'ovr', then a binary problem is fit for each label. Else the loss minimised is the multinomial loss fit across the entire probability distribution. Works only for the 'lbfgs' solver. verbose : int For the liblinear and lbfgs solvers set verbose to any positive number for verbosity. warm_start : bool, optional When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution. Useless for liblinear solver. .. versionadded:: 0.17 *warm_start* to support *lbfgs*, *newton-cg*, *sag* solvers. n_jobs : int, optional Number of CPU cores used during the cross-validation loop. If given a value of -1, all cores are used. **Attributes** ``coef_`` : array, shape (n_classes, n_features) Coefficient of the features in the decision function. ``intercept_`` : array, shape (n_classes,) Intercept (a.k.a. bias) added to the decision function. If `fit_intercept` is set to False, the intercept is set to zero. ``n_iter_`` : array, shape (n_classes,) or (1, ) Actual number of iterations for all classes. If binary or multinomial, it returns only 1 element. For liblinear solver, only the maximum number of iteration across all classes is given. See also SGDClassifier : incrementally trained logistic regression (when given the parameter ``loss="log"``). sklearn.svm.LinearSVC : learns SVM models using the same algorithm. **Notes** The underlying C implementation uses a random number generator to select features when fitting the model. It is thus not uncommon, to have slightly different results for the same input data. If that happens, try with a smaller tol parameter. Predict output may not match that of standalone liblinear in certain cases. See :ref:`differences from liblinear <liblinear_differences>` in the narrative documentation. **References** LIBLINEAR -- A Library for Large Linear Classification http://www.csie.ntu.edu.tw/~cjlin/liblinear/ Hsiang-Fu Yu, Fang-Lan Huang, Chih-Jen Lin (2011). Dual coordinate descent methods for logistic regression and maximum entropy models. Machine Learning 85(1-2):41-75. http://www.csie.ntu.edu.tw/~cjlin/papers/maxent_dual.pdf
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
Inherited from Inherited from Inherited from |
|||
Inherited from ClassifierCumulator | |||
---|---|---|---|
|
|||
|
|||
|
|||
Inherited from ClassifierNode | |||
|
|||
|
|||
|
|||
|
|||
|
|||
Inherited from Node | |||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|
|||
|
|||
|
|
|||
Inherited from |
|||
Inherited from Node | |||
---|---|---|---|
_train_seq List of tuples: |
|||
dtype dtype |
|||
input_dim Input dimensions |
|||
output_dim Output dimensions |
|||
supported_dtypes Supported dtypes |
|
Logistic Regression (aka logit, MaxEnt) classifier. This node has been automatically generated by wrapping the ``sklearn.linear_model.logistic.LogisticRegression`` class from the ``sklearn`` library. The wrapped instance can be accessed through the ``scikits_alg`` attribute. In the multiclass case, the training algorithm uses the one-vs-rest (OvR) scheme if the 'multi_class' option is set to 'ovr' and uses the cross-entropy loss, if the 'multi_class' option is set to 'multinomial'. (Currently the 'multinomial' option is supported only by the 'lbfgs' and 'newton-cg' solvers.) This class implements regularized logistic regression using the `liblinear` library, newton-cg and lbfgs solvers. It can handle both dense and sparse input. Use C-ordered arrays or CSR matrices containing 64-bit floats for optimal performance; any other input format will be converted (and copied). The newton-cg and lbfgs solvers support only L2 regularization with primal formulation. The liblinear solver supports both L1 and L2 regularization, with a dual formulation only for the L2 penalty. Read more in the :ref:`User Guide <logistic_regression>`. **Parameters** penalty : str, 'l1' or 'l2' Used to specify the norm used in the penalization. The newton-cg and lbfgs solvers support only l2 penalties. dual : bool Dual or primal formulation. Dual formulation is only implemented for l2 penalty with liblinear solver. Prefer dual=False when n_samples > n_features. C : float, optional (default=1.0) Inverse of regularization strength; must be a positive float. Like in support vector machines, smaller values specify stronger regularization. fit_intercept : bool, default: True Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function. intercept_scaling : float, default: 1 Useful only if solver is liblinear. when self.fit_intercept is True, instance vector x becomes [x, self.intercept_scaling], i.e. a "synthetic" feature with constant value equals to intercept_scaling is appended to the instance vector. The intercept becomes intercept_scaling * synthetic feature weight Note! the synthetic feature weight is subject to l1/l2 regularization as all other features. To lessen the effect of regularization on synthetic feature weight (and therefore on the intercept) intercept_scaling has to be increased. class_weight : dict or 'balanced', optional Weights associated with classes in the form ``{class_label: weight}``. If not given, all classes are supposed to have weight one. The "balanced" mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as ``n_samples / (n_classes * np.bincount(y))`` Note that these weights will be multiplied with sample_weight (passed through the fit method) if sample_weight is specified. .. versionadded:: 0.17 *class_weight='balanced'* instead of deprecated *class_weight='auto'*. max_iter : int Useful only for the newton-cg, sag and lbfgs solvers. Maximum number of iterations taken for the solvers to converge. random_state : int seed, RandomState instance, or None (default) The seed of the pseudo random number generator to use when shuffling the data. solver : {'newton-cg', 'lbfgs', 'liblinear', 'sag'} Algorithm to use in the optimization problem. - For small datasets, 'liblinear' is a good choice, whereas 'sag' is faster for large ones. - For multiclass problems, only 'newton-cg' and 'lbfgs' handle multinomial loss; 'sag' and 'liblinear' are limited to one-versus-rest schemes. - 'newton-cg', 'lbfgs' and 'sag' only handle L2 penalty. Note that 'sag' fast convergence is only guaranteed on features with approximately the same scale. You can preprocess the data with a scaler from sklearn.preprocessing. .. versionadded:: 0.17 Stochastic Average Gradient descent solver. tol : float, optional Tolerance for stopping criteria. multi_class : str, {'ovr', 'multinomial'} Multiclass option can be either 'ovr' or 'multinomial'. If the option chosen is 'ovr', then a binary problem is fit for each label. Else the loss minimised is the multinomial loss fit across the entire probability distribution. Works only for the 'lbfgs' solver. verbose : int For the liblinear and lbfgs solvers set verbose to any positive number for verbosity. warm_start : bool, optional When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution. Useless for liblinear solver. .. versionadded:: 0.17 *warm_start* to support *lbfgs*, *newton-cg*, *sag* solvers. n_jobs : int, optional Number of CPU cores used during the cross-validation loop. If given a value of -1, all cores are used. **Attributes** ``coef_`` : array, shape (n_classes, n_features) Coefficient of the features in the decision function. ``intercept_`` : array, shape (n_classes,) Intercept (a.k.a. bias) added to the decision function. If `fit_intercept` is set to False, the intercept is set to zero. ``n_iter_`` : array, shape (n_classes,) or (1, ) Actual number of iterations for all classes. If binary or multinomial, it returns only 1 element. For liblinear solver, only the maximum number of iteration across all classes is given. See also SGDClassifier : incrementally trained logistic regression (when given the parameter ``loss="log"``). sklearn.svm.LinearSVC : learns SVM models using the same algorithm. **Notes** The underlying C implementation uses a random number generator to select features when fitting the model. It is thus not uncommon, to have slightly different results for the same input data. If that happens, try with a smaller tol parameter. Predict output may not match that of standalone liblinear in certain cases. See :ref:`differences from liblinear <liblinear_differences>` in the narrative documentation. **References** LIBLINEAR -- A Library for Large Linear Classification http://www.csie.ntu.edu.tw/~cjlin/liblinear/ Hsiang-Fu Yu, Fang-Lan Huang, Chih-Jen Lin (2011). Dual coordinate descent methods for logistic regression and maximum entropy models. Machine Learning 85(1-2):41-75. http://www.csie.ntu.edu.tw/~cjlin/papers/maxent_dual.pdf
|
|
|
Transform the data and labels lists to array objects and reshape them.
|
|
|
Predict class labels for samples in X. This node has been automatically generated by wrapping the sklearn.linear_model.logistic.LogisticRegression class from the sklearn library. The wrapped instance can be accessed through the scikits_alg attribute. Parameters
Returns
|
Fit the model according to the given training data. This node has been automatically generated by wrapping the ``sklearn.linear_model.logistic.LogisticRegression`` class from the ``sklearn`` library. The wrapped instance can be accessed through the ``scikits_alg`` attribute. **Parameters** X : {array-like, sparse matrix}, shape (n_samples, n_features) Training vector, where n_samples in the number of samples and n_features is the number of features. y : array-like, shape (n_samples,) Target vector relative to X. sample_weight : array-like, shape (n_samples,) optional Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight. .. versionadded:: 0.17 *sample_weight* support to LogisticRegression. Returns self : object Returns self.
|
Home | Trees | Indices | Help |
|
---|
Generated by Epydoc 3.0.1 on Tue Mar 8 12:39:48 2016 | http://epydoc.sourceforge.net |