Package mdp :: Package nodes :: Class RandomizedLassoScikitsLearnNode

Class RandomizedLassoScikitsLearnNode



Randomized Lasso.

This node has been automatically generated by wrapping the ``sklearn.linear_model.randomized_l1.RandomizedLasso`` class
from the ``sklearn`` library.  The wrapped instance can be accessed
through the ``scikits_alg`` attribute.

Randomized Lasso works by resampling the train data and computing
a Lasso on each resampling. In short, the features selected more
often are good features. It is also known as stability selection.
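
The resampling procedure can be sketched in plain NumPy. This is a simplified illustration, not the wrapped implementation: the `lasso_ista` solver, the synthetic data, and all parameter values below are assumptions made for the sketch.

```python
import numpy as np

def lasso_ista(X, y, alpha, n_iter=300):
    # Plain iterative soft-thresholding for
    #   min_w  (1 / (2 n)) * ||y - X w||^2 + alpha * ||w||_1
    n, d = X.shape
    w = np.zeros(d)
    step = n / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = w - step * grad
        w = np.sign(w) * np.maximum(np.abs(w) - alpha * step, 0.0)
    return w

rng = np.random.RandomState(0)
n, d = 100, 10
X = rng.randn(n, d)
w_true = np.zeros(d)
w_true[:3] = [2.0, -3.0, 1.5]              # only the first 3 features matter
y = X @ w_true + 0.1 * rng.randn(n)

n_resampling, sample_fraction, scaling = 50, 0.75, 0.5
counts = np.zeros(d)
for _ in range(n_resampling):
    idx = rng.choice(n, int(sample_fraction * n), replace=False)
    # randomly shrink a subset of the columns (the 'scaling' perturbation)
    col_scale = np.where(rng.rand(d) < 0.5, scaling, 1.0)
    w = lasso_ista(X[idx] * col_scale, y[idx], alpha=0.1)
    counts += np.abs(w) > 1e-3             # which features survived this run?
scores = counts / n_resampling             # selection frequency per feature
selected = np.where(scores > 0.25)[0]      # cf. selection_threshold below
```

The informative features are selected in nearly every resampling, so their scores sit near 1, while noise features are almost never selected.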

Read more in the :ref:`User Guide <randomized_l1>`.

**Parameters**

alpha : float, 'aic', or 'bic', optional
    The regularization parameter alpha in the Lasso.
    Warning: this is not the alpha parameter in the stability selection
    article, which corresponds to scaling here.

scaling : float, optional
    The alpha parameter in the stability selection article used to
    randomly scale the features. Should be between 0 and 1.

sample_fraction : float, optional
    The fraction of samples to be used in each randomized design.
    Should be between 0 and 1. If 1, all samples are used.
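
For instance, each randomized design draws a subsample of the rows without replacement (a sketch; the index-drawing details are an assumption, not the wrapped code):

```python
import numpy as np

n_samples, sample_fraction = 200, 0.75
rng = np.random.RandomState(42)
# one randomized design: a fraction of the rows, drawn without replacement
idx = rng.choice(n_samples, int(sample_fraction * n_samples), replace=False)
```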

n_resampling : int, optional
    Number of randomized models.

selection_threshold : float, optional
    The score above which features should be selected.

fit_intercept : boolean, optional
    Whether to calculate the intercept for this model. If set
    to False, no intercept will be used in calculations
    (e.g. the data is expected to be already centered).

verbose : boolean or integer, optional
    Sets the verbosity amount.

normalize : boolean, optional, default True
    If True, the regressors X will be normalized before regression.

precompute : True | False | 'auto'
    Whether to use a precomputed Gram matrix to speed up
    calculations. If set to 'auto', the choice is made
    automatically. The Gram matrix can also be passed as an argument.

max_iter : integer, optional
    Maximum number of iterations to perform in the Lars algorithm.

eps : float, optional
    The machine-precision regularization in the computation of the
    Cholesky diagonal factors. Increase this for very ill-conditioned
    systems. Unlike the 'tol' parameter in some iterative
    optimization-based algorithms, this parameter does not control
    the tolerance of the optimization.

n_jobs : integer, optional
    Number of CPUs to use during the resampling. If -1, all
    CPUs are used.

random_state : int, RandomState instance or None, optional (default=None)
    If int, random_state is the seed used by the random number generator;
    If RandomState instance, random_state is the random number generator;
    If None, the random number generator is the RandomState instance used
    by `np.random`.
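
The three accepted forms follow the usual scikit-learn convention. A simplified sketch of that convention (modeled on `sklearn.utils.check_random_state`, with error handling reduced for brevity):

```python
import numpy as np

def check_random_state(seed):
    # Simplified sketch of the random_state convention described above.
    if seed is None:
        return np.random.mtrand._rand          # the global RandomState
    if isinstance(seed, (int, np.integer)):
        return np.random.RandomState(seed)     # seed a fresh generator
    if isinstance(seed, np.random.RandomState):
        return seed                            # use the generator as-is
    raise ValueError("%r cannot be used to seed a RandomState" % seed)
```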

pre_dispatch : int, or string, optional
    Controls the number of jobs that get dispatched during parallel
    execution. Reducing this number can be useful to avoid an
    explosion of memory consumption when more jobs get dispatched
    than CPUs can process. This parameter can be:


        - None, in which case all the jobs are immediately
          created and spawned. Use this for lightweight and
          fast-running jobs, to avoid delays due to on-demand
          spawning of the jobs

        - An int, giving the exact number of total jobs that are
          spawned

        - A string, giving an expression as a function of n_jobs,
          as in '2*n_jobs'

memory : Instance of joblib.Memory or string
    Used for internal caching. By default, no caching is done.
    If a string is given, it is the path to the caching directory.

**Attributes**

``scores_`` : array, shape = [n_features]
    Feature scores between 0 and 1.

``all_scores_`` : array, shape = [n_features, n_reg_parameter]
    Feature scores between 0 and 1 for all values of the regularization
    parameter. The reference article suggests ``scores_`` is the max of
    ``all_scores_``.
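
The stated relation between the two attributes can be checked directly (toy numbers, purely illustrative):

```python
import numpy as np

# rows = features, columns = values of the regularization parameter
all_scores = np.array([[0.2, 0.9, 0.6],
                       [0.1, 0.0, 0.3]])
scores = all_scores.max(axis=1)  # per-feature max over the alpha grid
```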

**Examples**

>>> from sklearn.linear_model import RandomizedLasso
>>> randomized_lasso = RandomizedLasso()

**Notes**

See examples/linear_model/plot_sparse_recovery.py for an example.

**References**

Stability selection
Nicolai Meinshausen, Peter Buhlmann
Journal of the Royal Statistical Society: Series B
Volume 72, Issue 4, pages 417-473, September 2010
DOI: 10.1111/j.1467-9868.2010.00740.x

See also

RandomizedLogisticRegression, LogisticRegression

Instance Methods
 
__init__(self, input_dim=None, output_dim=None, dtype=None, **kwargs)
Randomized Lasso.
 
_execute(self, x)
 
_get_supported_dtypes(self)
Return the list of dtypes supported by this node. The types can be specified in any format allowed by numpy.dtype.
 
_stop_training(self, **kwargs)
Concatenate the collected data in a single array.
 
execute(self, x)
This node has been automatically generated by wrapping the sklearn.linear_model.randomized_l1.RandomizedLasso class from the sklearn library. The wrapped instance can be accessed through the scikits_alg attribute.
 
stop_training(self, **kwargs)
Fit the model using X, y as training data.

Inherited from unreachable.newobject: __long__, __native__, __nonzero__, __unicode__, next

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__

    Inherited from Cumulator
 
_train(self, *args)
Collect all input data in a list.
 
train(self, *args)
Collect all input data in a list.
    Inherited from Node
 
__add__(self, other)
 
__call__(self, x, *args, **kwargs)
Calling an instance of Node is equivalent to calling its execute method.
 
__repr__(self)
repr(x)
 
__str__(self)
str(x)
 
_check_input(self, x)
 
_check_output(self, y)
 
_check_train_args(self, x, *args, **kwargs)
 
_get_train_seq(self)
 
_if_training_stop_training(self)
 
_inverse(self, x)
 
_pre_execution_checks(self, x)
This method contains all pre-execution checks.
 
_pre_inversion_checks(self, y)
This method contains all pre-inversion checks.
 
_refcast(self, x)
Helper function to cast arrays to the internal dtype.
 
_set_dtype(self, t)
 
_set_input_dim(self, n)
 
_set_output_dim(self, n)
 
copy(self, protocol=None)
Return a deep copy of the node.
 
get_current_train_phase(self)
Return the index of the current training phase.
 
get_dtype(self)
Return dtype.
 
get_input_dim(self)
Return input dimensions.
 
get_output_dim(self)
Return output dimensions.
 
get_remaining_train_phase(self)
Return the number of training phases still to accomplish.
 
get_supported_dtypes(self)
Return dtypes supported by the node as a list of dtype objects.
 
has_multiple_training_phases(self)
Return True if the node has multiple training phases.
 
inverse(self, y, *args, **kwargs)
Invert y.
 
is_training(self)
Return True if the node is in the training phase, False otherwise.
 
save(self, filename, protocol=-1)
Save a pickled serialization of the node to filename. If filename is None, return a string.
 
set_dtype(self, t)
Set internal structures' dtype.
 
set_input_dim(self, n)
Set input dimensions.
 
set_output_dim(self, n)
Set output dimensions.
Static Methods
 
is_invertible()
Return True if the node can be inverted, False otherwise.
 
is_trainable()
Return True if the node can be trained, False otherwise.
Properties

Inherited from object: __class__

    Inherited from Node
  _train_seq
List of tuples:
  dtype
dtype
  input_dim
Input dimensions
  output_dim
Output dimensions
  supported_dtypes
Supported dtypes
Method Details

__init__(self, input_dim=None, output_dim=None, dtype=None, **kwargs)
(Constructor)

 

Randomized Lasso.

See the class description above for the full parameter, attribute,
example, and reference documentation.

Overrides: object.__init__

_execute(self, x)

 
Overrides: Node._execute

_get_supported_dtypes(self)

 
Return the list of dtypes supported by this node. The types can be specified in any format allowed by numpy.dtype.
Overrides: Node._get_supported_dtypes

_stop_training(self, **kwargs)

 
Concatenate the collected data in a single array.
Overrides: Node._stop_training

execute(self, x)

 
This node has been automatically generated by wrapping the sklearn.linear_model.randomized_l1.RandomizedLasso class from the sklearn library. The wrapped instance can be accessed through the scikits_alg attribute.
Overrides: Node.execute

is_invertible()
Static Method

 
Return True if the node can be inverted, False otherwise.
Overrides: Node.is_invertible
(inherited documentation)

is_trainable()
Static Method

 
Return True if the node can be trained, False otherwise.
Overrides: Node.is_trainable

stop_training(self, **kwargs)

 

Fit the model using X, y as training data.

This node has been automatically generated by wrapping the sklearn.linear_model.randomized_l1.RandomizedLasso class from the sklearn library. The wrapped instance can be accessed through the scikits_alg attribute.

**Parameters**

X : array-like or sparse matrix, shape = [n_samples, n_features]
    Training data.

y : array-like, shape = [n_samples]
    Target values.

**Returns**

self : object
    Returns an instance of self.
Overrides: Node.stop_training