Package mdp :: Package nodes :: Class RFEScikitsLearnNode

Class RFEScikitsLearnNode



Feature ranking with recursive feature elimination.

This node has been automatically generated by wrapping the ``sklearn.feature_selection.rfe.RFE`` class
from the ``sklearn`` library.  The wrapped instance can be accessed
through the ``scikits_alg`` attribute.

Given an external estimator that assigns weights to features (e.g., the
coefficients of a linear model), the goal of recursive feature elimination
(RFE) is to select features by recursively considering smaller and smaller
sets of features. First, the estimator is trained on the initial set of
features and weights are assigned to each one of them. Then, features whose
absolute weights are the smallest are pruned from the current set of features.
That procedure is recursively repeated on the pruned set until the desired
number of features to select is eventually reached.
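
The procedure described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the wrapped ``sklearn`` implementation: the function name ``rfe_sketch`` is hypothetical, ordinary least squares stands in for the external estimator, and the synthetic data are made up for the demonstration.

```python
import numpy as np

def rfe_sketch(X, y, n_features_to_select, step=1):
    """Illustrative RFE loop; least squares stands in for the estimator."""
    n_features = X.shape[1]
    support = np.ones(n_features, dtype=bool)  # features still in play
    ranking = np.ones(n_features, dtype=int)   # survivors keep rank 1
    while support.sum() > n_features_to_select:
        active = np.flatnonzero(support)
        # Train the stand-in estimator on the surviving features only.
        coef, *_ = np.linalg.lstsq(X[:, active], y, rcond=None)
        # Never remove more features than needed to reach the target.
        n_remove = int(min(step, support.sum() - n_features_to_select))
        # Prune the features with the smallest absolute weights.
        weakest = active[np.argsort(np.abs(coef))[:n_remove]]
        support[weakest] = False
        # Every eliminated feature drops one rank per round, so the
        # earliest eliminations end up with the largest rank.
        ranking[~support] += 1
    return support, ranking

# Synthetic data (assumption): only columns 0 and 2 influence the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 4.0 * X[:, 0] - 3.0 * X[:, 2] + 0.01 * rng.normal(size=200)
support, ranking = rfe_sketch(X, y, n_features_to_select=2)
```

Here ``support`` marks the two informative columns and ``ranking`` assigns rank 1 to them, with larger ranks for features that were pruned earlier.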

Read more in the :ref:`User Guide <rfe>`.

**Parameters**

estimator : object
    A supervised learning estimator with a `fit` method that updates a
    `coef_` attribute that holds the fitted parameters. Important features
    must correspond to high absolute values in the `coef_` array.

    For instance, this is the case for most supervised learning
    algorithms such as Support Vector Classifiers and Generalized
    Linear Models from the `svm` and `linear_model` modules.

n_features_to_select : int or None (default=None)
    The number of features to select. If `None`, half of the features
    are selected.

step : int or float, optional (default=1)
    If greater than or equal to 1, then `step` corresponds to the (integer)
    number of features to remove at each iteration.
    If within (0.0, 1.0), then `step` corresponds to the fraction
    (rounded down) of features to remove at each iteration.

estimator_params : dict
    Parameters for the external estimator.
    This parameter is deprecated as of version 0.16 and will be removed in
    0.18. Set the parameters when constructing the estimator, or use its
    `set_params` method, instead.

verbose : int, default=0
    Controls verbosity of output.
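
To make the two interpretations of `step` concrete, the sketch below computes how many features remain after each iteration. It assumes that a fractional `step` is taken of the initial feature count, that at least one feature is removed per iteration, and that the last iteration removes only as many features as needed; the helper names are hypothetical and the wrapped class may differ in detail.

```python
def features_removed_per_iteration(step, n_features):
    """Translate `step` into a feature count (assumed rule, see lead-in)."""
    if step <= 0:
        raise ValueError("step must be positive")
    if step < 1:
        # Fraction of the initial feature count, rounded down, at least 1.
        return max(1, int(step * n_features))
    return int(step)

def elimination_schedule(n_features, n_features_to_select, step=1):
    """Return the number of remaining features after each iteration."""
    per_round = features_removed_per_iteration(step, n_features)
    counts = [n_features]
    while counts[-1] > n_features_to_select:
        # The final round removes only what is still needed.
        remove = min(per_round, counts[-1] - n_features_to_select)
        counts.append(counts[-1] - remove)
    return counts

integer_step = elimination_schedule(10, 5, step=1)
fractional_step = elimination_schedule(10, 5, step=0.25)
```

With ten features and five to keep, an integer step of 1 takes five iterations, while a fractional step of 0.25 removes two features per iteration and finishes in three.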

**Attributes**

``n_features_`` : int
    The number of selected features.

``support_`` : array of shape [n_features]
    The mask of selected features.

``ranking_`` : array of shape [n_features]
    The feature ranking, such that ``ranking_[i]`` corresponds to the
    ranking position of the i-th feature. Selected (i.e., estimated
    best) features are assigned rank 1.

``estimator_`` : object
    The external estimator fit on the reduced dataset.
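
As a rough illustration of how these attributes relate, the snippet below starts from a ranking array (the values are copied from the example output in this docstring) and derives the mask, the number of selected features, and the elimination order. It assumes that the mask is simply "rank equals 1", as the attribute descriptions suggest; the input matrix is made up.

```python
import numpy as np

# Ranking copied from the example output; rank 1 marks selected features.
ranking = np.array([1, 1, 1, 1, 1, 6, 4, 3, 2, 5])

# Assumed relationship: the support mask is True wherever the rank is 1.
support = ranking == 1
n_selected = int(support.sum())

# Applying the mask to the columns keeps only the selected features.
X = np.arange(30, dtype=float).reshape(3, 10)  # made-up input, 3 samples
X_reduced = X[:, support]

# Larger ranks were pruned earlier, so sorting by descending rank
# recovers the order in which features were eliminated.
order = np.argsort(ranking, kind="stable")
elimination_order = order[n_selected:][::-1]
```

The first entry of ``elimination_order`` is the feature that was pruned first, i.e. the one with the largest rank.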

**Examples**

The following example shows how to retrieve the five informative
features in the Friedman #1 dataset.

>>> from sklearn.datasets import make_friedman1
>>> from sklearn.feature_selection import RFE
>>> from sklearn.svm import SVR
>>> X, y = make_friedman1(n_samples=50, n_features=10, random_state=0)
>>> estimator = SVR(kernel="linear")
>>> selector = RFE(estimator, n_features_to_select=5, step=1)
>>> selector = selector.fit(X, y)
>>> selector.support_ # doctest: +NORMALIZE_WHITESPACE
array([ True,  True,  True,  True,  True,
        False, False, False, False, False], dtype=bool)
>>> selector.ranking_
array([1, 1, 1, 1, 1, 6, 4, 3, 2, 5])

**References**


.. [1] Guyon, I., Weston, J., Barnhill, S., & Vapnik, V., "Gene selection
       for cancer classification using support vector machines",
       Mach. Learn., 46(1-3), 389--422, 2002.

Instance Methods
 
__init__(self, input_dim=None, output_dim=None, dtype=None, **kwargs)
Feature ranking with recursive feature elimination.
 
_execute(self, x)
 
_get_supported_dtypes(self)
Return the list of dtypes supported by this node. The types can be specified in any format allowed by numpy.dtype.
 
_stop_training(self, **kwargs)
Concatenate the collected data in a single array.
 
execute(self, x)
Reduce X to the selected features.
 
stop_training(self, **kwargs)
Fit the RFE model and then the underlying estimator on the selected features.

Inherited from unreachable.newobject: __long__, __native__, __nonzero__, __unicode__, next

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__

    Inherited from Cumulator
 
_train(self, *args)
Collect all input data in a list.
 
train(self, *args)
Collect all input data in a list.
    Inherited from Node
 
__add__(self, other)
 
__call__(self, x, *args, **kwargs)
Calling an instance of Node is equivalent to calling its execute method.
 
__repr__(self)
repr(x)
 
__str__(self)
str(x)
 
_check_input(self, x)
 
_check_output(self, y)
 
_check_train_args(self, x, *args, **kwargs)
 
_get_train_seq(self)
 
_if_training_stop_training(self)
 
_inverse(self, x)
 
_pre_execution_checks(self, x)
This method contains all pre-execution checks.
 
_pre_inversion_checks(self, y)
This method contains all pre-inversion checks.
 
_refcast(self, x)
Helper function to cast arrays to the internal dtype.
 
_set_dtype(self, t)
 
_set_input_dim(self, n)
 
_set_output_dim(self, n)
 
copy(self, protocol=None)
Return a deep copy of the node.
 
get_current_train_phase(self)
Return the index of the current training phase.
 
get_dtype(self)
Return dtype.
 
get_input_dim(self)
Return input dimensions.
 
get_output_dim(self)
Return output dimensions.
 
get_remaining_train_phase(self)
Return the number of training phases still to accomplish.
 
get_supported_dtypes(self)
Return dtypes supported by the node as a list of dtype objects.
 
has_multiple_training_phases(self)
Return True if the node has multiple training phases.
 
inverse(self, y, *args, **kwargs)
Invert y.
 
is_training(self)
Return True if the node is in the training phase, False otherwise.
 
save(self, filename, protocol=-1)
Save a pickled serialization of the node to filename. If filename is None, return a string.
 
set_dtype(self, t)
Set internal structures' dtype.
 
set_input_dim(self, n)
Set input dimensions.
 
set_output_dim(self, n)
Set output dimensions.
Static Methods
 
is_invertible()
Return True if the node can be inverted, False otherwise.
 
is_trainable()
Return True if the node can be trained, False otherwise.
Properties

Inherited from object: __class__

    Inherited from Node
  _train_seq
List of tuples:
  dtype
dtype
  input_dim
Input dimensions
  output_dim
Output dimensions
  supported_dtypes
Supported dtypes
Method Details

__init__(self, input_dim=None, output_dim=None, dtype=None, **kwargs)
(Constructor)

 

Feature ranking with recursive feature elimination.

Overrides: object.__init__

_execute(self, x)

 
Overrides: Node._execute

_get_supported_dtypes(self)

 
Return the list of dtypes supported by this node. The types can be specified in any format allowed by numpy.dtype.
Overrides: Node._get_supported_dtypes

_stop_training(self, **kwargs)

 
Concatenate the collected data in a single array.
Overrides: Node._stop_training
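
The collect-then-concatenate behaviour described for ``train`` and ``_stop_training`` can be pictured with the following sketch. ``ChunkCollector`` is a hypothetical stand-in written for illustration, not the actual Cumulator code, and its attribute names are assumptions.

```python
import numpy as np

class ChunkCollector:
    """Hypothetical stand-in for the collect-then-concatenate pattern."""

    def __init__(self):
        self._chunks = []  # one array per call to train()
        self.data = None   # filled in by stop_training()

    def train(self, x):
        # Collect the incoming chunk in a list; nothing is fitted yet.
        self._chunks.append(np.asarray(x, dtype=float))

    def stop_training(self):
        # Concatenate all collected chunks along the sample axis.
        self.data = np.concatenate(self._chunks, axis=0)
        self._chunks = []
        return self.data

collector = ChunkCollector()
collector.train(np.zeros((4, 3)))  # first chunk: 4 samples, 3 features
collector.train(np.ones((2, 3)))   # second chunk: 2 samples, 3 features
data = collector.stop_training()
```

Deferring the concatenation to the end lets data arrive in chunks while still giving a batch estimator a single array to fit on.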

execute(self, x)

 

Reduce X to the selected features.

**Parameters**

X : array of shape [n_samples, n_features]
    The input samples.

**Returns**

X_r : array of shape [n_samples, n_selected_features]
    The input samples with only the selected features.

Overrides: Node.execute

is_invertible()
Static Method

 
Return True if the node can be inverted, False otherwise.
Overrides: Node.is_invertible
(inherited documentation)

is_trainable()
Static Method

 
Return True if the node can be trained, False otherwise.
Overrides: Node.is_trainable

stop_training(self, **kwargs)

 

Fit the RFE model and then the underlying estimator on the selected features.

**Parameters**

X : {array-like, sparse matrix}, shape = [n_samples, n_features]
    The training input samples.

y : array-like, shape = [n_samples]
    The target values.

Overrides: Node.stop_training