
Class LassoLarsICScikitsLearnNode



Lasso model fit with Lars using BIC or AIC for model selection

This node has been automatically generated by wrapping the ``sklearn.linear_model.least_angle.LassoLarsIC`` class
from the ``sklearn`` library.  The wrapped instance can be accessed
through the ``scikits_alg`` attribute.

The optimization objective for Lasso is::

    (1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1
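
As a rough illustration of the objective above, the following sketch evaluates it in plain Python for a given coefficient vector. The helper name ``lasso_objective`` and the sample data are assumptions made for this example; they are not part of ``mdp`` or ``sklearn``.

```python
def lasso_objective(X, y, w, alpha):
    """Evaluate (1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1.

    Hypothetical helper for illustration only.
    """
    n_samples = len(y)
    # Residuals y - Xw, computed one sample (row) at a time.
    residuals = [
        y_i - sum(x_ij * w_j for x_ij, w_j in zip(row, w))
        for row, y_i in zip(X, y)
    ]
    squared_error = sum(r * r for r in residuals)
    l1_penalty = sum(abs(w_j) for w_j in w)
    return squared_error / (2.0 * n_samples) + alpha * l1_penalty


# Illustrative data: w fits y exactly, so only the L1 penalty remains.
X = [[-1.0, 1.0], [0.0, 0.0], [1.0, 1.0]]
y = [-1.0, 0.0, -1.0]
objective_value = lasso_objective(X, y, w=[0.0, -1.0], alpha=0.5)
```

With a perfect fit the squared-error term vanishes, which makes the effect of ``alpha`` on the penalty term easy to see in isolation.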

AIC is the Akaike information criterion and BIC is the Bayesian
information criterion. Such criteria are useful for selecting the value
of the regularization parameter by trading off goodness of fit against
the complexity of the model. A good model should explain the data well
while remaining simple.
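
The trade-off described above can be sketched with one common Gaussian-likelihood form of the two criteria, ``n_samples * log(RSS / n_samples) + K * df``, where ``K`` is 2 for AIC and ``log(n_samples)`` for BIC. This is an assumption made for illustration: the wrapped class may use different constants or a different noise estimate, and the helper name ``information_criterion`` is hypothetical.

```python
import math


def information_criterion(rss, n_samples, degrees_of_freedom, criterion="bic"):
    """Return an AIC or BIC score; lower values indicate a better trade-off.

    Hypothetical helper: assumes Gaussian noise and the textbook form
    n * log(RSS / n) + K * df.
    """
    if criterion == "aic":
        penalty = 2.0
    elif criterion == "bic":
        penalty = math.log(n_samples)
    else:
        raise ValueError("criterion should be either 'bic' or 'aic'")
    mean_squared_error = rss / n_samples
    # First term rewards goodness of fit, second term penalizes complexity.
    return n_samples * math.log(mean_squared_error) + penalty * degrees_of_freedom


# Illustrative numbers: 100 samples, residual sum of squares 10, 3 active coefficients.
aic_score = information_criterion(10.0, 100, 3, criterion="aic")
bic_score = information_criterion(10.0, 100, 3, criterion="bic")
```

Because ``log(n_samples)`` exceeds 2 once there are more than about 7 samples, BIC under this form penalizes each additional degree of freedom more heavily than AIC and tends to favor sparser models.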

Read more in the :ref:`User Guide <least_angle_regression>`.

**Parameters**

criterion : 'bic' | 'aic'
    The type of criterion to use.

fit_intercept : boolean
    Whether to calculate the intercept for this model. If set
    to False, no intercept will be used in calculations
    (e.g. data is expected to be already centered).

positive : boolean (default=False)
    Restrict coefficients to be >= 0. Be aware that you might want to
    disable fit_intercept, which is set to True by default.
    Under the positive restriction the model coefficients do not converge
    to the ordinary-least-squares solution for small values of alpha.
    Only coefficients up to the smallest alpha value (``alphas_[alphas_ >
    0.].min()`` when fit_path=True) reached by the stepwise Lars-Lasso
    algorithm are typically in congruence with the solution of the
    coordinate descent Lasso estimator.
    As a consequence, using LassoLarsIC only makes sense for problems where
    a sparse solution is expected and/or reached.

verbose : boolean or integer, optional
    Sets the verbosity level.

normalize : boolean, optional, default True
    If True, the regressors X will be normalized before regression.
    (The example below shows ``normalize=True`` for a default-constructed
    estimator.)

copy_X : boolean, optional, default True
    If True, X will be copied; else, it may be overwritten.

precompute : True | False | 'auto' | array-like
    Whether to use a precomputed Gram matrix to speed up
    calculations. If set to ``'auto'``, the estimator decides. The Gram
    matrix can also be passed as an argument.

max_iter : integer, optional
    Maximum number of iterations to perform. Can be used for
    early stopping.

eps : float, optional
    The machine-precision regularization in the computation of the
    Cholesky diagonal factors. Increase this for very ill-conditioned
    systems. Unlike the ``tol`` parameter in some iterative
    optimization-based algorithms, this parameter does not control
    the tolerance of the optimization.
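
The ``fit_intercept`` description above notes that, when no intercept is fitted, the data is expected to be centered already. A minimal sketch of column-wise centering in plain Python follows; the helper name ``center_columns`` is hypothetical and not part of ``mdp`` or ``sklearn``.

```python
def center_columns(X):
    """Subtract each column's mean so that every feature has zero mean.

    Hypothetical helper for illustration only.
    """
    n_samples = len(X)
    # zip(*X) iterates over columns (features) instead of rows (samples).
    means = [sum(column) / n_samples for column in zip(*X)]
    centered = [[x - m for x, m in zip(row, means)] for row in X]
    return centered, means


X = [[1.0, 2.0], [3.0, 6.0]]
X_centered, column_means = center_columns(X)
```

Keeping the column means around makes it possible to apply the same shift to new samples before prediction.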


**Attributes**

``coef_`` : array, shape (n_features,)
    Parameter vector (w in the objective formula above).

``intercept_`` : float
    Independent term in the decision function.

``alpha_`` : float
    The alpha parameter chosen by the information criterion.

``n_iter_`` : int
    Number of iterations run by lars_path to find the grid of
    alphas.

``criterion_`` : array, shape (n_alphas,)
    The value of the information criterion ('aic', 'bic') across all
    alphas. The alpha with the smallest criterion value is chosen.
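
The selection rule described for ``criterion_`` and ``alpha_`` amounts to an argmin over the criterion values. The sketch below shows that rule in plain Python; the helper name ``select_alpha`` and the numbers are invented for illustration and do not come from a fitted model.

```python
def select_alpha(alphas, criterion_values):
    """Return (index, alpha) of the smallest information-criterion value.

    Hypothetical helper mirroring the rule stated for ``alpha_``.
    """
    best_index = min(range(len(criterion_values)),
                     key=criterion_values.__getitem__)
    return best_index, alphas[best_index]


# Illustrative values only: one criterion score per alpha on the path.
alphas = [1.0, 0.5, 0.1, 0.0]
criterion_values = [12.3, 9.8, 10.4, 11.0]
best_index, best_alpha = select_alpha(alphas, criterion_values)
```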

**Examples**

>>> from sklearn import linear_model
>>> clf = linear_model.LassoLarsIC(criterion='bic')
>>> clf.fit([[-1, 1], [0, 0], [1, 1]], [-1.1111, 0, -1.1111])
... # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE
LassoLarsIC(copy_X=True, criterion='bic', eps=..., fit_intercept=True,
      max_iter=500, normalize=True, positive=False, precompute='auto',
      verbose=False)
>>> print(clf.coef_) # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE
[ 0.  -1.11...]

**Notes**

The number of degrees of freedom is estimated as described in:

"On the degrees of freedom of the lasso"
Hui Zou, Trevor Hastie, and Robert Tibshirani
Ann. Statist. Volume 35, Number 5 (2007), 2173-2192.

http://en.wikipedia.org/wiki/Akaike_information_criterion
http://en.wikipedia.org/wiki/Bayesian_information_criterion

See also

lars_path, LassoLars, LassoLarsCV

Instance Methods
 
__init__(self, input_dim=None, output_dim=None, dtype=None, **kwargs)
Lasso model fit with Lars using BIC or AIC for model selection
 
_execute(self, x)
 
_get_supported_dtypes(self)
Return the list of dtypes supported by this node. The types can be specified in any format allowed by numpy.dtype.
 
_stop_training(self, **kwargs)
Concatenate the collected data in a single array.
 
execute(self, x)
Predict using the linear model
 
stop_training(self, **kwargs)
Fit the model using X, y as training data.

Inherited from unreachable.newobject: __long__, __native__, __nonzero__, __unicode__, next

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__

    Inherited from Cumulator
 
_train(self, *args)
Collect all input data in a list.
 
train(self, *args)
Collect all input data in a list.
    Inherited from Node
 
__add__(self, other)
 
__call__(self, x, *args, **kwargs)
Calling an instance of Node is equivalent to calling its execute method.
 
__repr__(self)
repr(x)
 
__str__(self)
str(x)
 
_check_input(self, x)
 
_check_output(self, y)
 
_check_train_args(self, x, *args, **kwargs)
 
_get_train_seq(self)
 
_if_training_stop_training(self)
 
_inverse(self, x)
 
_pre_execution_checks(self, x)
This method contains all pre-execution checks.
 
_pre_inversion_checks(self, y)
This method contains all pre-inversion checks.
 
_refcast(self, x)
Helper function to cast arrays to the internal dtype.
 
_set_dtype(self, t)
 
_set_input_dim(self, n)
 
_set_output_dim(self, n)
 
copy(self, protocol=None)
Return a deep copy of the node.
 
get_current_train_phase(self)
Return the index of the current training phase.
 
get_dtype(self)
Return dtype.
 
get_input_dim(self)
Return input dimensions.
 
get_output_dim(self)
Return output dimensions.
 
get_remaining_train_phase(self)
Return the number of training phases still to accomplish.
 
get_supported_dtypes(self)
Return dtypes supported by the node as a list of dtype objects.
 
has_multiple_training_phases(self)
Return True if the node has multiple training phases.
 
inverse(self, y, *args, **kwargs)
Invert y.
 
is_training(self)
Return True if the node is in the training phase, False otherwise.
 
save(self, filename, protocol=-1)
Save a pickled serialization of the node to filename. If filename is None, return a string.
 
set_dtype(self, t)
Set internal structures' dtype.
 
set_input_dim(self, n)
Set input dimensions.
 
set_output_dim(self, n)
Set output dimensions.
Static Methods
 
is_invertible()
Return True if the node can be inverted, False otherwise.
 
is_trainable()
Return True if the node can be trained, False otherwise.
Properties

Inherited from object: __class__

    Inherited from Node
  _train_seq
List of tuples:
  dtype
dtype
  input_dim
Input dimensions
  output_dim
Output dimensions
  supported_dtypes
Supported dtypes
Method Details

__init__(self, input_dim=None, output_dim=None, dtype=None, **kwargs)
(Constructor)

 

Lasso model fit with Lars using BIC or AIC for model selection

Overrides: object.__init__

_execute(self, x)

 
Overrides: Node._execute

_get_supported_dtypes(self)

 
Return the list of dtypes supported by this node. The types can be specified in any format allowed by numpy.dtype.
Overrides: Node._get_supported_dtypes

_stop_training(self, **kwargs)

 
Concatenate the collected data in a single array.
Overrides: Node._stop_training
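
The collect-then-concatenate behaviour described for ``_train`` and ``_stop_training`` can be sketched as follows. ``ListCumulator`` is a hypothetical stand-in written for this example; it is not the actual ``Cumulator`` implementation and uses plain lists rather than arrays.

```python
class ListCumulator(object):
    """Hypothetical stand-in for the collect-then-concatenate pattern."""

    def __init__(self):
        self.chunks = []   # one entry per call to train()
        self.data = None   # filled in by stop_training()

    def train(self, x):
        # Collect each incoming chunk of rows without processing it.
        self.chunks.append(list(x))

    def stop_training(self):
        # Concatenate all collected chunks into a single list of rows.
        self.data = [row for chunk in self.chunks for row in chunk]
        return self.data


cumulator = ListCumulator()
cumulator.train([[-1.0, 1.0], [0.0, 0.0]])
cumulator.train([[1.0, 1.0]])
all_rows = cumulator.stop_training()
```

Deferring the concatenation until training stops lets data arrive in chunks while the wrapped estimator still sees a single data set when it is fitted.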

execute(self, x)

 

Predict using the linear model

Parameters

X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Samples.

Returns

C : array, shape = (n_samples,)
Returns predicted values.
Overrides: Node.execute
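
Predicting with a linear model means computing ``X w + intercept`` for each sample. The sketch below shows that computation in plain Python; ``predict_linear`` is a hypothetical helper, and the coefficients are illustrative values rather than the output of a fitted node.

```python
def predict_linear(X, coef, intercept=0.0):
    """Return one prediction per row of X: dot(row, coef) + intercept.

    Hypothetical helper for illustration only.
    """
    return [
        sum(x_j * w_j for x_j, w_j in zip(row, coef)) + intercept
        for row in X
    ]


# Illustrative coefficients; a trained node would supply coef_ and intercept_.
samples = [[-1.0, 1.0], [0.0, 0.0], [1.0, 1.0]]
predictions = predict_linear(samples, coef=[0.0, -1.0], intercept=0.5)
```

The result has one value per input row, matching the ``(n_samples,)`` shape given for the return value above.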

is_invertible()
Static Method

 
Return True if the node can be inverted, False otherwise.
Overrides: Node.is_invertible
(inherited documentation)

is_trainable()
Static Method

 
Return True if the node can be trained, False otherwise.
Overrides: Node.is_trainable

stop_training(self, **kwargs)

 

Fit the model using X, y as training data.

Parameters

X : array-like, shape (n_samples, n_features)
    Training data.
y : array-like, shape (n_samples,)
    Target values.
copy_X : boolean, optional, default True
    If True, X will be copied; else, it may be overwritten.

Returns

self : object
    Returns an instance of self.
Overrides: Node.stop_training