
Class CalibratedClassifierCVScikitsLearnNode



Probability calibration with isotonic regression or sigmoid.

This node has been automatically generated by wrapping the ``sklearn.calibration.CalibratedClassifierCV`` class
from the ``sklearn`` library.  The wrapped instance can be accessed
through the ``scikits_alg`` attribute.

With this class, the base_estimator is fit on the training split of each
cross-validation fold, and the corresponding test split is used for
calibration. The probabilities from all folds are then averaged
for prediction. If cv="prefit" is passed to ``__init__``,
it is assumed that base_estimator has already been
fitted, and all data is used for calibration. Note that the data used
for fitting the classifier and for calibrating it must be disjoint.

Read more in the :ref:`User Guide <calibration>`.
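
The fold-averaging step described above can be pictured with a small, library-free sketch. The helper name ``average_fold_probabilities`` and the input layout (one list of per-sample class-probability rows per fold) are assumptions made for illustration; this is not the wrapped scikit-learn implementation.

```python
def average_fold_probabilities(fold_probas):
    """Average the class probabilities predicted by several calibrated classifiers.

    fold_probas[k][i][c] is the probability that the classifier calibrated on
    fold k assigns to class c for sample i (hypothetical layout).
    """
    n_folds = len(fold_probas)
    n_samples = len(fold_probas[0])
    n_classes = len(fold_probas[0][0])
    averaged = []
    for i in range(n_samples):
        # Mean over folds, computed separately for every class.
        row = [sum(fold_probas[k][i][c] for k in range(n_folds)) / n_folds
               for c in range(n_classes)]
        averaged.append(row)
    return averaged


# Three folds, two samples, two classes (made-up numbers).
fold_probas = [
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.7, 0.3], [0.2, 0.8]],
    [[0.8, 0.2], [0.3, 0.7]],
]
mean_proba = average_fold_probabilities(fold_probas)
```

Because each fold's row sums to one, the averaged rows do as well, so the result can be read directly as class probabilities.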

**Parameters**

base_estimator : instance BaseEstimator
    The classifier whose output decision function needs to be calibrated
    to offer more accurate predict_proba outputs. If cv="prefit", the
    classifier must already have been fitted on data.

method : 'sigmoid' or 'isotonic'
    The method to use for calibration. Can be 'sigmoid', which
    corresponds to Platt's method, or 'isotonic', which is a
    non-parametric approach. Isotonic calibration is not advised
    with too few calibration samples ``(<<1000)`` since it tends to overfit.
    Use sigmoid calibration (Platt's method) in this case.
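
To make the difference between the two methods concrete, here is a minimal, library-free sketch. It assumes binary labels, distinct decision scores, plain gradient descent for the sigmoid, and a piecewise-constant isotonic fit; the function names (``fit_sigmoid``, ``fit_isotonic`` and so on) are hypothetical, and the wrapped library's own fitting procedures may differ.

```python
import math


def fit_sigmoid(scores, labels, lr=0.1, n_iter=2000):
    """Fit p = 1 / (1 + exp(a * score + b)) by gradient descent on log loss."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(a * s + b))
            # For this parameterisation d(log loss)/d(a*s + b) = y - p.
            grad_a += (y - p) * s
            grad_b += (y - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def sigmoid_proba(a, b, score):
    return 1.0 / (1.0 + math.exp(a * score + b))


def fit_isotonic(scores, labels):
    """Pool adjacent violators: a non-decreasing step function of the score."""
    blocks = []  # each block: [sum of labels, count, largest score in block]
    for s, y in sorted(zip(scores, labels)):
        blocks.append([float(y), 1, s])
        # Merge while the previous block's mean exceeds the newest block's mean.
        while len(blocks) > 1 and (blocks[-2][0] / blocks[-2][1]
                                   > blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]


def predict_isotonic(blocks, score):
    """Return the value of the first block whose upper bound covers the score."""
    for upper, value in blocks:
        if score <= upper:
            return value
    return blocks[-1][1]


# Made-up decision scores and binary labels used only for calibration.
scores = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
labels = [0, 0, 1, 0, 1, 1]
a, b = fit_sigmoid(scores, labels)
blocks = fit_isotonic(scores, labels)
```

The sigmoid has only two parameters, whereas the isotonic fit keeps one value per block of samples. With very few calibration samples the step function can simply memorise the labels, which is the overfitting risk mentioned above.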

cv : integer, cross-validation generator, iterable or "prefit", optional
    Determines the cross-validation splitting strategy.
    Possible inputs for cv are:

    - None, to use the default 3-fold cross-validation.
    - An integer, to specify the number of folds.
    - An object to be used as a cross-validation generator.
    - An iterable yielding train/test splits.

    For integer/None inputs, if ``y`` is binary or multiclass,
    :class:`StratifiedKFold` is used. If ``y`` is neither binary nor
    multiclass, :class:`KFold` is used.

    Refer to the :ref:`User Guide <cross_validation>` for the various
    cross-validation strategies that can be used here.

    If "prefit" is passed, it is assumed that base_estimator has been
    fitted already and all data is used for calibration.
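
As an illustration of what a stratified split provides, the sketch below assigns samples to folds round-robin within each class, so every test fold keeps roughly the overall class proportions and never overlaps its training set. The function ``stratified_folds`` is a hypothetical simplification and not the algorithm used by :class:`StratifiedKFold`.

```python
def stratified_folds(labels, n_folds=3):
    """Yield (train_indices, test_indices) pairs with class proportions preserved."""
    fold_of = [0] * len(labels)
    next_fold = {}  # next fold to receive a sample, tracked per class
    for i, y in enumerate(labels):
        k = next_fold.get(y, 0)
        fold_of[i] = k
        next_fold[y] = (k + 1) % n_folds
    for k in range(n_folds):
        test = [i for i, f in enumerate(fold_of) if f == k]
        train = [i for i, f in enumerate(fold_of) if f != k]
        yield train, test


# Six samples of class 0 and three of class 1 (made-up labels).
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1]
splits = list(stratified_folds(labels, n_folds=3))
```

Each test fold here contains two samples of class 0 and one of class 1. The classifier would be fitted on ``train`` and calibrated on ``test``, which keeps the two data sets disjoint.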

**Attributes**

``classes_`` : array, shape (n_classes)
    The class labels.

``calibrated_classifiers_`` : list (len() equal to cv or 1 if cv == "prefit")
    The list of calibrated classifiers, one for each cross-validation fold.
    Each has been fitted on all but the validation fold and calibrated
    on the validation fold.

**References**

.. [1] Obtaining calibrated probability estimates from decision trees
       and naive Bayesian classifiers, B. Zadrozny & C. Elkan, ICML 2001

.. [2] Transforming Classifier Scores into Accurate Multiclass
       Probability Estimates, B. Zadrozny & C. Elkan, KDD 2002

.. [3] Probabilistic Outputs for Support Vector Machines and Comparisons to
       Regularized Likelihood Methods, J. Platt, 1999

.. [4] Predicting Good Probabilities with Supervised Learning,
       A. Niculescu-Mizil & R. Caruana, ICML 2005

Instance Methods
 
__init__(self, input_dim=None, output_dim=None, dtype=None, **kwargs)
Probability calibration with isotonic regression or sigmoid.
 
_get_supported_dtypes(self)
Return the list of dtypes supported by this node. The types can be specified in any format allowed by numpy.dtype.
 
_label(self, x)
 
_stop_training(self, **kwargs)
Transform the data and labels lists to array objects and reshape them.
 
label(self, x)
Predict the target of new samples. Can be different from the prediction of the uncalibrated classifier.
 
stop_training(self, **kwargs)
Fit the calibrated model.

Inherited from PreserveDimNode (private): _set_input_dim, _set_output_dim

Inherited from unreachable.newobject: __long__, __native__, __nonzero__, __unicode__, next

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__

    Inherited from ClassifierCumulator
 
_check_train_args(self, x, labels)
 
_train(self, x, labels)
Cumulate all input data in a one dimensional list.
 
train(self, x, labels)
Cumulate all input data in a one dimensional list.
    Inherited from ClassifierNode
 
_execute(self, x)
 
_prob(self, x, *args, **kargs)
 
execute(self, x)
Process the data contained in x.
 
prob(self, x, *args, **kwargs)
Predict probability for each possible outcome.
 
rank(self, x, threshold=None)
Return, for each sample, the list of all labels ordered according to prob(x) (e.g., [[3 1 2], [2 1 3], ...]).
    Inherited from Node
 
__add__(self, other)
 
__call__(self, x, *args, **kwargs)
Calling an instance of Node is equivalent to calling its execute method.
 
__repr__(self)
repr(x)
 
__str__(self)
str(x)
 
_check_input(self, x)
 
_check_output(self, y)
 
_get_train_seq(self)
 
_if_training_stop_training(self)
 
_inverse(self, x)
 
_pre_execution_checks(self, x)
This method contains all pre-execution checks.
 
_pre_inversion_checks(self, y)
This method contains all pre-inversion checks.
 
_refcast(self, x)
Helper function to cast arrays to the internal dtype.
 
_set_dtype(self, t)
 
copy(self, protocol=None)
Return a deep copy of the node.
 
get_current_train_phase(self)
Return the index of the current training phase.
 
get_dtype(self)
Return dtype.
 
get_input_dim(self)
Return input dimensions.
 
get_output_dim(self)
Return output dimensions.
 
get_remaining_train_phase(self)
Return the number of training phases still to accomplish.
 
get_supported_dtypes(self)
Return dtypes supported by the node as a list of dtype objects.
 
has_multiple_training_phases(self)
Return True if the node has multiple training phases.
 
inverse(self, y, *args, **kwargs)
Invert y.
 
is_training(self)
Return True if the node is in the training phase, False otherwise.
 
save(self, filename, protocol=-1)
Save a pickled serialization of the node to filename. If filename is None, return a string.
 
set_dtype(self, t)
Set internal structures' dtype.
 
set_input_dim(self, n)
Set input dimensions.
 
set_output_dim(self, n)
Set output dimensions.
Static Methods
 
is_invertible()
Return True if the node can be inverted, False otherwise.
 
is_trainable()
Return True if the node can be trained, False otherwise.
Properties

Inherited from object: __class__

    Inherited from Node
  _train_seq
List of tuples:
  dtype
dtype
  input_dim
Input dimensions
  output_dim
Output dimensions
  supported_dtypes
Supported dtypes
Method Details

__init__(self, input_dim=None, output_dim=None, dtype=None, **kwargs)
(Constructor)

 

Probability calibration with isotonic regression or sigmoid.

Overrides: object.__init__

_get_supported_dtypes(self)

 
Return the list of dtypes supported by this node. The types can be specified in any format allowed by numpy.dtype.
Overrides: Node._get_supported_dtypes

_label(self, x)

 
Overrides: ClassifierNode._label

_stop_training(self, **kwargs)

 
Transform the data and labels lists to array objects and reshape them.

Overrides: Node._stop_training

is_invertible()
Static Method

 
Return True if the node can be inverted, False otherwise.
Overrides: Node.is_invertible
(inherited documentation)

is_trainable()
Static Method

 
Return True if the node can be trained, False otherwise.
Overrides: Node.is_trainable

label(self, x)

 

Predict the target of new samples. Can be different from the prediction of the uncalibrated classifier.

Parameters

X : array-like, shape (n_samples, n_features)
    The samples.

Returns

C : array, shape (n_samples,)
    The predicted class.
Overrides: ClassifierNode.label
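
The following sketch shows one way the calibrated prediction can differ from the uncalibrated one. It assumes a binary problem, a base classifier that predicts class 1 for a positive decision score, and sigmoid calibration with made-up parameters ``a`` and ``b``; all names are hypothetical and not part of this node's API.

```python
import math


def uncalibrated_label(score):
    """Predict class 1 when the raw decision score is positive."""
    return 1 if score > 0 else 0


def calibrated_label(score, a, b):
    """Predict class 1 when the sigmoid-calibrated probability exceeds 0.5."""
    p = 1.0 / (1.0 + math.exp(a * score + b))
    return 1 if p > 0.5 else 0


# Hypothetical sigmoid parameters, as if learned on held-out calibration data.
a, b = -2.0, 1.0
score = 0.3
raw = uncalibrated_label(score)       # decision threshold at score 0
cal = calibrated_label(score, a, b)   # threshold moves to score 0.5
```

With these parameters the calibrated probability crosses 0.5 at a score of 0.5 rather than 0, so a sample with score 0.3 is labelled 1 by the raw classifier but 0 after calibration.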

stop_training(self, **kwargs)

 

Fit the calibrated model.

Parameters

X : array-like, shape (n_samples, n_features)
    Training data.
y : array-like, shape (n_samples,)
    Target values.
sample_weight : array-like, shape (n_samples,) or None
    Sample weights. If None, then samples are equally weighted.

Returns

self : object
    Returns an instance of self.
Overrides: Node.stop_training