Package mdp :: Package nodes :: Class MiniBatchDictionaryLearningScikitsLearnNode

Class MiniBatchDictionaryLearningScikitsLearnNode



Mini-batch dictionary learning

This node has been automatically generated by wrapping the ``sklearn.decomposition.dict_learning.MiniBatchDictionaryLearning`` class
from the ``sklearn`` library.  The wrapped instance can be accessed
through the ``scikits_alg`` attribute.

Finds a dictionary (a set of atoms) that can best be used to represent data
using a sparse code.

Solves the optimization problem::


   (U^*,V^*) = argmin 0.5 || Y - U V ||_2^2 + alpha * || U ||_1
                (U,V)
                with || V_k ||_2 = 1 for all  0 <= k < n_components
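
The objective can be checked numerically with plain NumPy; the names ``Y``, ``U``, ``V``, and ``alpha`` below are the symbols from the formula above, not attributes of this node, and the codes are dense stand-ins rather than the output of a real solver:

```python
import numpy as np

# Toy-sized illustration of the objective above (assumed shapes:
# Y is (n_samples, n_features), U is (n_samples, n_components),
# V is (n_components, n_features) with unit-norm rows).
rng = np.random.RandomState(0)
n_samples, n_features, n_components = 8, 5, 3

Y = rng.randn(n_samples, n_features)            # data
V = rng.randn(n_components, n_features)         # dictionary atoms as rows
V /= np.linalg.norm(V, axis=1, keepdims=True)   # enforce ||V_k||_2 = 1
U = rng.randn(n_samples, n_components)          # stand-in sparse codes
alpha = 1.0

# 0.5 * || Y - U V ||_2^2  +  alpha * || U ||_1
objective = 0.5 * np.linalg.norm(Y - U @ V) ** 2 + alpha * np.abs(U).sum()
```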

Read more in the :ref:`User Guide <DictionaryLearning>`.

**Parameters**

n_components : int,
    number of dictionary elements to extract

alpha : float,
    sparsity controlling parameter

n_iter : int,
    total number of iterations to perform

fit_algorithm : {'lars', 'cd'}
    lars: uses the least angle regression method to solve the lasso problem
    (linear_model.lars_path)
    cd: uses the coordinate descent method to compute the
    Lasso solution (linear_model.Lasso). Lars will be faster if
    the estimated components are sparse.

transform_algorithm : {'lasso_lars', 'lasso_cd', 'lars', 'omp', 'threshold'}
    Algorithm used to transform the data.
    lars: uses the least angle regression method (linear_model.lars_path)
    lasso_lars: uses Lars to compute the Lasso solution
    lasso_cd: uses the coordinate descent method to compute the
    Lasso solution (linear_model.Lasso). lasso_lars will be faster if
    the estimated components are sparse.
    omp: uses orthogonal matching pursuit to estimate the sparse solution
    threshold: squashes to zero all coefficients less than alpha from
    the projection dictionary * X'

transform_n_nonzero_coefs : int, ``0.1 * n_features`` by default
    Number of nonzero coefficients to target in each column of the
    solution. This is only used by `algorithm='lars'` and `algorithm='omp'`
    and is overridden by `alpha` in the `omp` case.

transform_alpha : float, 1. by default
    If `algorithm='lasso_lars'` or `algorithm='lasso_cd'`, `alpha` is the
    penalty applied to the L1 norm.
    If `algorithm='threshold'`, `alpha` is the absolute value of the
    threshold below which coefficients will be squashed to zero.
    If `algorithm='omp'`, `alpha` is the tolerance parameter: the value of
    the reconstruction error targeted. In this case, it overrides
    `n_nonzero_coefs`.

split_sign : bool, False by default
    Whether to split the sparse feature vector into the concatenation of
    its negative part and its positive part. This can improve the
    performance of downstream classifiers.

n_jobs : int,
    number of parallel jobs to run

dict_init : array of shape (n_components, n_features),
    initial value of the dictionary for warm restart scenarios

verbose :
    degree of verbosity of the printed output

batch_size : int,
    number of samples in each mini-batch

shuffle : bool,
    whether to shuffle the samples before forming batches

random_state : int or RandomState
    Pseudo-random number generator state used for random sampling.
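
Two of the transform options above are simple enough to sketch directly in NumPy: the ``'threshold'`` coding rule and the ``split_sign`` post-processing. The arrays ``D`` and ``X`` are illustrative names, and the thresholding follows the description given here (zeroing coefficients below ``alpha``) rather than the library's exact implementation:

```python
import numpy as np

# Sketch of 'threshold' coding as described above: project the data onto
# the dictionary (dictionary * X') and squash small coefficients to zero.
rng = np.random.RandomState(0)
D = rng.randn(3, 5)                  # dictionary, (n_components, n_features)
X = rng.randn(4, 5)                  # data, (n_samples, n_features)
alpha = 0.5

code = X @ D.T                       # projections, (n_samples, n_components)
code[np.abs(code) < alpha] = 0.0     # zero out coefficients below alpha

# Sketch of split_sign=True: concatenate the positive part and the negated
# negative part, so every output coefficient is non-negative.
split = np.hstack([np.maximum(code, 0), np.maximum(-code, 0)])
```

The original code is recoverable as the difference of the two halves, which is why the split form loses no information while giving sign-sensitive downstream classifiers separate features for positive and negative activations.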

**Attributes**

``components_`` : array, [n_components, n_features]
    components extracted from the data

``inner_stats_`` : tuple of (A, B) ndarrays
    Internal sufficient statistics that are kept by the algorithm.
    Keeping them is useful in online settings, to avoid losing the
    history of the evolution, but they shouldn't have any use for the
    end user.
    A (n_components, n_components) is the dictionary covariance matrix.
    B (n_features, n_components) is the data approximation matrix.

``n_iter_`` : int
    Number of iterations run.
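
The role of statistics shaped like ``A`` and ``B`` can be illustrated with the accumulation step of online dictionary learning (Mairal et al., 2009); the variable names below are illustrative and the sparse codes are random placeholders, not the node's internals or the output of a real solver:

```python
import numpy as np

# How sufficient statistics of this shape accumulate over a sample stream
# (a sketch of the online update, not the node's internal code).
n_components, n_features = 3, 5
A = np.zeros((n_components, n_components))   # code covariance, like A above
B = np.zeros((n_features, n_components))     # data/code product, like B above

rng = np.random.RandomState(0)
for _ in range(10):
    x = rng.randn(n_features)        # one streamed sample
    u = rng.randn(n_components)      # its sparse code (random placeholder)
    A += np.outer(u, u)              # A accumulates u u^T
    B += np.outer(x, u)              # B accumulates x u^T
```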

**Notes**

**References:**

J. Mairal, F. Bach, J. Ponce, G. Sapiro, 2009: Online dictionary learning
for sparse coding (http://www.di.ens.fr/sierra/pdfs/icml09.pdf)

See also

SparseCoder
DictionaryLearning
SparsePCA
MiniBatchSparsePCA

Instance Methods
 
__init__(self, input_dim=None, output_dim=None, dtype=None, **kwargs)
Mini-batch dictionary learning
 
_execute(self, x)
 
_get_supported_dtypes(self)
Return the list of dtypes supported by this node. The types can be specified in any format allowed by numpy.dtype.
 
_stop_training(self, **kwargs)
Concatenate the collected data in a single array.
 
execute(self, x)
Encode the data as a sparse combination of the dictionary atoms.
 
stop_training(self, **kwargs)
Fit the model from data in X.

Inherited from unreachable.newobject: __long__, __native__, __nonzero__, __unicode__, next

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__

    Inherited from Cumulator
 
_train(self, *args)
Collect all input data in a list.
 
train(self, *args)
Collect all input data in a list.
    Inherited from Node
 
__add__(self, other)
 
__call__(self, x, *args, **kwargs)
Calling an instance of Node is equivalent to calling its execute method.
 
__repr__(self)
repr(x)
 
__str__(self)
str(x)
 
_check_input(self, x)
 
_check_output(self, y)
 
_check_train_args(self, x, *args, **kwargs)
 
_get_train_seq(self)
 
_if_training_stop_training(self)
 
_inverse(self, x)
 
_pre_execution_checks(self, x)
This method contains all pre-execution checks.
 
_pre_inversion_checks(self, y)
This method contains all pre-inversion checks.
 
_refcast(self, x)
Helper function to cast arrays to the internal dtype.
 
_set_dtype(self, t)
 
_set_input_dim(self, n)
 
_set_output_dim(self, n)
 
copy(self, protocol=None)
Return a deep copy of the node.
 
get_current_train_phase(self)
Return the index of the current training phase.
 
get_dtype(self)
Return dtype.
 
get_input_dim(self)
Return input dimensions.
 
get_output_dim(self)
Return output dimensions.
 
get_remaining_train_phase(self)
Return the number of training phases still to accomplish.
 
get_supported_dtypes(self)
Return dtypes supported by the node as a list of dtype objects.
 
has_multiple_training_phases(self)
Return True if the node has multiple training phases.
 
inverse(self, y, *args, **kwargs)
Invert y.
 
is_training(self)
Return True if the node is in the training phase, False otherwise.
 
save(self, filename, protocol=-1)
Save a pickled serialization of the node to filename. If filename is None, return a string.
 
set_dtype(self, t)
Set internal structures' dtype.
 
set_input_dim(self, n)
Set input dimensions.
 
set_output_dim(self, n)
Set output dimensions.
Static Methods
 
is_invertible()
Return True if the node can be inverted, False otherwise.
 
is_trainable()
Return True if the node can be trained, False otherwise.
Properties

Inherited from object: __class__

    Inherited from Node
  _train_seq
List of tuples:
  dtype
dtype
  input_dim
Input dimensions
  output_dim
Output dimensions
  supported_dtypes
Supported dtypes
Method Details

__init__(self, input_dim=None, output_dim=None, dtype=None, **kwargs)
(Constructor)

 

Mini-batch dictionary learning

See the class docstring above for the full description of the
parameters, attributes, and references.

Overrides: object.__init__

_execute(self, x)

 
Overrides: Node._execute

_get_supported_dtypes(self)

 
Return the list of dtypes supported by this node. The types can be specified in any format allowed by numpy.dtype.
Overrides: Node._get_supported_dtypes

_stop_training(self, **kwargs)

 
Concatenate the collected data in a single array.
Overrides: Node._stop_training

execute(self, x)

 

Encode the data as a sparse combination of the dictionary atoms.


Coding method is determined by the object parameter transform_algorithm.

Parameters

X : array of shape (n_samples, n_features)
Test data to be transformed, must have the same number of features as the data used to train the model.

Returns

X_new : array, shape (n_samples, n_components)
Transformed data
Overrides: Node.execute

is_invertible()
Static Method

 
Return True if the node can be inverted, False otherwise.
Overrides: Node.is_invertible
(inherited documentation)

is_trainable()
Static Method

 
Return True if the node can be trained, False otherwise.
Overrides: Node.is_trainable

stop_training(self, **kwargs)

 

Fit the model from data in X.


Parameters

X: array-like, shape (n_samples, n_features)
Training vector, where n_samples is the number of samples and n_features is the number of features.

Returns

self : object
Returns the instance itself.
Overrides: Node.stop_training