| Home | Trees | Indices | Help |
|
|---|
|
|
Logistic Regression CV (aka logit, MaxEnt) classifier.
This node has been automatically generated by wrapping the ``sklearn.linear_model.logistic.LogisticRegressionCV`` class
from the ``sklearn`` library. The wrapped instance can be accessed
through the ``scikits_alg`` attribute.
This class implements logistic regression using liblinear, newton-cg, sag
of lbfgs optimizer. The newton-cg, sag and lbfgs solvers support only L2
regularization with primal formulation. The liblinear solver supports both
L1 and L2 regularization, with a dual formulation only for the L2 penalty.
For the grid of Cs values (that are set by default to be ten values in
a logarithmic scale between 1e-4 and 1e4), the best hyperparameter is
selected by the cross-validator StratifiedKFold, but it can be changed
using the cv parameter. In the case of newton-cg and lbfgs solvers,
we warm start along the path i.e guess the initial coefficients of the
present fit to be the coefficients got after convergence in the previous
fit, so it is supposed to be faster for high-dimensional dense data.
For a multiclass problem, the hyperparameters for each class are computed
using the best scores got by doing a one-vs-rest in parallel across all
folds and classes. Hence this is not the true multinomial loss.
Read more in the :ref:`User Guide <logistic_regression>`.
**Parameters**
Cs : list of floats | int
Each of the values in Cs describes the inverse of regularization
strength. If Cs is as an int, then a grid of Cs values are chosen
in a logarithmic scale between 1e-4 and 1e4.
Like in support vector machines, smaller values specify stronger
regularization.
fit_intercept : bool, default: True
Specifies if a constant (a.k.a. bias or intercept) should be
added to the decision function.
class_weight : dict or 'balanced', optional
Weights associated with classes in the form ``{class_label: weight}``.
If not given, all classes are supposed to have weight one.
The "balanced" mode uses the values of y to automatically adjust
weights inversely proportional to class frequencies in the input data
as ``n_samples / (n_classes * np.bincount(y))``
Note that these weights will be multiplied with sample_weight (passed
through the fit method) if sample_weight is specified.
.. versionadded:: 0.17
class_weight == 'balanced'
cv : integer or cross-validation generator
The default cross-validation generator used is Stratified K-Folds.
If an integer is provided, then it is the number of folds used.
See the module :mod:`sklearn.cross_validation` module for the
list of possible cross-validation objects.
penalty : str, 'l1' or 'l2'
Used to specify the norm used in the penalization. The newton-cg and
lbfgs solvers support only l2 penalties.
dual : bool
Dual or primal formulation. Dual formulation is only implemented for
l2 penalty with liblinear solver. Prefer dual=False when
n_samples > n_features.
scoring : callabale
Scoring function to use as cross-validation criteria. For a list of
scoring functions that can be used, look at :mod:`sklearn.metrics`.
The default scoring option used is accuracy_score.
solver : {'newton-cg', 'lbfgs', 'liblinear', 'sag'}
Algorithm to use in the optimization problem.
- For small datasets, 'liblinear' is a good choice, whereas 'sag' is
faster for large ones.
- For multiclass problems, only 'newton-cg' and 'lbfgs' handle
multinomial loss; 'sag' and 'liblinear' are limited to
one-versus-rest schemes.
- 'newton-cg', 'lbfgs' and 'sag' only handle L2 penalty.
- 'liblinear' might be slower in LogisticRegressionCV because it does
not handle warm-starting.
tol : float, optional
Tolerance for stopping criteria.
max_iter : int, optional
Maximum number of iterations of the optimization algorithm.
n_jobs : int, optional
Number of CPU cores used during the cross-validation loop. If given
a value of -1, all cores are used.
verbose : int
For the 'liblinear', 'sag' and 'lbfgs' solvers set verbose to any
positive number for verbosity.
refit : bool
If set to True, the scores are averaged across all folds, and the
coefs and the C that corresponds to the best score is taken, and a
final refit is done using these parameters.
Otherwise the coefs, intercepts and C that correspond to the
best scores across folds are averaged.
multi_class : str, {'ovr', 'multinomial'}
Multiclass option can be either 'ovr' or 'multinomial'. If the option
chosen is 'ovr', then a binary problem is fit for each label. Else
the loss minimised is the multinomial loss fit across
the entire probability distribution. Works only for 'lbfgs' and
'newton-cg' solvers.
intercept_scaling : float, default 1.
Useful only if solver is liblinear.
This parameter is useful only when the solver 'liblinear' is used
and self.fit_intercept is set to True. In this case, x becomes
[x, self.intercept_scaling],
i.e. a "synthetic" feature with constant value equals to
intercept_scaling is appended to the instance vector.
The intercept becomes intercept_scaling * synthetic feature weight
Note! the synthetic feature weight is subject to l1/l2 regularization
as all other features.
To lessen the effect of regularization on synthetic feature weight
(and therefore on the intercept) intercept_scaling has to be increased.
random_state : int seed, RandomState instance, or None (default)
The seed of the pseudo random number generator to use when
shuffling the data.
**Attributes**
``coef_`` : array, shape (1, n_features) or (n_classes, n_features)
Coefficient of the features in the decision function.
`coef_` is of shape (1, n_features) when the given problem
is binary.
`coef_` is readonly property derived from `raw_coef_` that
follows the internal memory layout of liblinear.
``intercept_`` : array, shape (1,) or (n_classes,)
Intercept (a.k.a. bias) added to the decision function.
It is available only when parameter intercept is set to True
and is of shape(1,) when the problem is binary.
``Cs_`` : array
Array of C i.e. inverse of regularization parameter values used
for cross-validation.
``coefs_paths_`` : array, shape ``(n_folds, len(Cs_), n_features)`` or ``(n_folds, len(Cs_), n_features + 1)``
dict with classes as the keys, and the path of coefficients obtained
during cross-validating across each fold and then across each Cs
after doing an OvR for the corresponding class as values.
If the 'multi_class' option is set to 'multinomial', then
the coefs_paths are the coefficients corresponding to each class.
Each dict value has shape ``(n_folds, len(Cs_), n_features)`` or
``(n_folds, len(Cs_), n_features + 1)`` depending on whether the
intercept is fit or not.
``scores_`` : dict
dict with classes as the keys, and the values as the
grid of scores obtained during cross-validating each fold, after doing
an OvR for the corresponding class. If the 'multi_class' option
given is 'multinomial' then the same scores are repeated across
all classes, since this is the multinomial class.
Each dict value has shape (n_folds, len(Cs))
``C_`` : array, shape (n_classes,) or (n_classes - 1,)
Array of C that maps to the best scores across every class. If refit is
set to False, then for each class, the best C is the average of the
C's that correspond to the best scores for each fold.
``n_iter_`` : array, shape (n_classes, n_folds, n_cs) or (1, n_folds, n_cs)
Actual number of iterations for all classes, folds and Cs.
In the binary or multinomial cases, the first dimension is equal to 1.
See also
LogisticRegression
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
Inherited from Inherited from Inherited from |
|||
| Inherited from ClassifierCumulator | |||
|---|---|---|---|
|
|||
|
|||
|
|||
| Inherited from ClassifierNode | |||
|
|||
|
|||
|
|||
|
|||
|
|||
| Inherited from Node | |||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
Inherited from |
|||
| Inherited from Node | |||
|---|---|---|---|
|
_train_seq List of tuples: |
|||
|
dtype dtype |
|||
|
input_dim Input dimensions |
|||
|
output_dim Output dimensions |
|||
|
supported_dtypes Supported dtypes |
|||
|
|||
Logistic Regression CV (aka logit, MaxEnt) classifier.
This node has been automatically generated by wrapping the ``sklearn.linear_model.logistic.LogisticRegressionCV`` class
from the ``sklearn`` library. The wrapped instance can be accessed
through the ``scikits_alg`` attribute.
This class implements logistic regression using liblinear, newton-cg, sag
of lbfgs optimizer. The newton-cg, sag and lbfgs solvers support only L2
regularization with primal formulation. The liblinear solver supports both
L1 and L2 regularization, with a dual formulation only for the L2 penalty.
For the grid of Cs values (that are set by default to be ten values in
a logarithmic scale between 1e-4 and 1e4), the best hyperparameter is
selected by the cross-validator StratifiedKFold, but it can be changed
using the cv parameter. In the case of newton-cg and lbfgs solvers,
we warm start along the path i.e guess the initial coefficients of the
present fit to be the coefficients got after convergence in the previous
fit, so it is supposed to be faster for high-dimensional dense data.
For a multiclass problem, the hyperparameters for each class are computed
using the best scores got by doing a one-vs-rest in parallel across all
folds and classes. Hence this is not the true multinomial loss.
Read more in the :ref:`User Guide <logistic_regression>`.
**Parameters**
Cs : list of floats | int
Each of the values in Cs describes the inverse of regularization
strength. If Cs is as an int, then a grid of Cs values are chosen
in a logarithmic scale between 1e-4 and 1e4.
Like in support vector machines, smaller values specify stronger
regularization.
fit_intercept : bool, default: True
Specifies if a constant (a.k.a. bias or intercept) should be
added to the decision function.
class_weight : dict or 'balanced', optional
Weights associated with classes in the form ``{class_label: weight}``.
If not given, all classes are supposed to have weight one.
The "balanced" mode uses the values of y to automatically adjust
weights inversely proportional to class frequencies in the input data
as ``n_samples / (n_classes * np.bincount(y))``
Note that these weights will be multiplied with sample_weight (passed
through the fit method) if sample_weight is specified.
.. versionadded:: 0.17
class_weight == 'balanced'
cv : integer or cross-validation generator
The default cross-validation generator used is Stratified K-Folds.
If an integer is provided, then it is the number of folds used.
See the module :mod:`sklearn.cross_validation` module for the
list of possible cross-validation objects.
penalty : str, 'l1' or 'l2'
Used to specify the norm used in the penalization. The newton-cg and
lbfgs solvers support only l2 penalties.
dual : bool
Dual or primal formulation. Dual formulation is only implemented for
l2 penalty with liblinear solver. Prefer dual=False when
n_samples > n_features.
scoring : callabale
Scoring function to use as cross-validation criteria. For a list of
scoring functions that can be used, look at :mod:`sklearn.metrics`.
The default scoring option used is accuracy_score.
solver : {'newton-cg', 'lbfgs', 'liblinear', 'sag'}
Algorithm to use in the optimization problem.
- For small datasets, 'liblinear' is a good choice, whereas 'sag' is
faster for large ones.
- For multiclass problems, only 'newton-cg' and 'lbfgs' handle
multinomial loss; 'sag' and 'liblinear' are limited to
one-versus-rest schemes.
- 'newton-cg', 'lbfgs' and 'sag' only handle L2 penalty.
- 'liblinear' might be slower in LogisticRegressionCV because it does
not handle warm-starting.
tol : float, optional
Tolerance for stopping criteria.
max_iter : int, optional
Maximum number of iterations of the optimization algorithm.
n_jobs : int, optional
Number of CPU cores used during the cross-validation loop. If given
a value of -1, all cores are used.
verbose : int
For the 'liblinear', 'sag' and 'lbfgs' solvers set verbose to any
positive number for verbosity.
refit : bool
If set to True, the scores are averaged across all folds, and the
coefs and the C that corresponds to the best score is taken, and a
final refit is done using these parameters.
Otherwise the coefs, intercepts and C that correspond to the
best scores across folds are averaged.
multi_class : str, {'ovr', 'multinomial'}
Multiclass option can be either 'ovr' or 'multinomial'. If the option
chosen is 'ovr', then a binary problem is fit for each label. Else
the loss minimised is the multinomial loss fit across
the entire probability distribution. Works only for 'lbfgs' and
'newton-cg' solvers.
intercept_scaling : float, default 1.
Useful only if solver is liblinear.
This parameter is useful only when the solver 'liblinear' is used
and self.fit_intercept is set to True. In this case, x becomes
[x, self.intercept_scaling],
i.e. a "synthetic" feature with constant value equals to
intercept_scaling is appended to the instance vector.
The intercept becomes intercept_scaling * synthetic feature weight
Note! the synthetic feature weight is subject to l1/l2 regularization
as all other features.
To lessen the effect of regularization on synthetic feature weight
(and therefore on the intercept) intercept_scaling has to be increased.
random_state : int seed, RandomState instance, or None (default)
The seed of the pseudo random number generator to use when
shuffling the data.
**Attributes**
``coef_`` : array, shape (1, n_features) or (n_classes, n_features)
Coefficient of the features in the decision function.
`coef_` is of shape (1, n_features) when the given problem
is binary.
`coef_` is readonly property derived from `raw_coef_` that
follows the internal memory layout of liblinear.
``intercept_`` : array, shape (1,) or (n_classes,)
Intercept (a.k.a. bias) added to the decision function.
It is available only when parameter intercept is set to True
and is of shape(1,) when the problem is binary.
``Cs_`` : array
Array of C i.e. inverse of regularization parameter values used
for cross-validation.
``coefs_paths_`` : array, shape ``(n_folds, len(Cs_), n_features)`` or ``(n_folds, len(Cs_), n_features + 1)``
dict with classes as the keys, and the path of coefficients obtained
during cross-validating across each fold and then across each Cs
after doing an OvR for the corresponding class as values.
If the 'multi_class' option is set to 'multinomial', then
the coefs_paths are the coefficients corresponding to each class.
Each dict value has shape ``(n_folds, len(Cs_), n_features)`` or
``(n_folds, len(Cs_), n_features + 1)`` depending on whether the
intercept is fit or not.
``scores_`` : dict
dict with classes as the keys, and the values as the
grid of scores obtained during cross-validating each fold, after doing
an OvR for the corresponding class. If the 'multi_class' option
given is 'multinomial' then the same scores are repeated across
all classes, since this is the multinomial class.
Each dict value has shape (n_folds, len(Cs))
``C_`` : array, shape (n_classes,) or (n_classes - 1,)
Array of C that maps to the best scores across every class. If refit is
set to False, then for each class, the best C is the average of the
C's that correspond to the best scores for each fold.
``n_iter_`` : array, shape (n_classes, n_folds, n_cs) or (1, n_folds, n_cs)
Actual number of iterations for all classes, folds and Cs.
In the binary or multinomial cases, the first dimension is equal to 1.
See also
LogisticRegression
|
|
|
Transform the data and labels lists to array objects and reshape them.
|
|
|
Predict class labels for samples in X. This node has been automatically generated by wrapping the sklearn.linear_model.logistic.LogisticRegressionCV class from the sklearn library. The wrapped instance can be accessed through the scikits_alg attribute. Parameters
Returns
|
Fit the model according to the given training data. This node has been automatically generated by wrapping the sklearn.linear_model.logistic.LogisticRegressionCV class from the sklearn library. The wrapped instance can be accessed through the scikits_alg attribute. Parameters
Returns
|
| Home | Trees | Indices | Help |
|
|---|
| Generated by Epydoc 3.0.1 on Tue Mar 8 12:39:48 2016 | http://epydoc.sourceforge.net |