Package mdp :: Package nodes :: Class FactorAnalysisScikitsLearnNode

Class FactorAnalysisScikitsLearnNode



Factor Analysis (FA)

This node has been automatically generated by wrapping the ``sklearn.decomposition.factor_analysis.FactorAnalysis`` class
from the ``sklearn`` library.  The wrapped instance can be accessed
through the ``scikits_alg`` attribute.

A simple linear generative model with Gaussian latent variables.

The observations are assumed to be caused by a linear transformation of
lower dimensional latent factors and added Gaussian noise.
Without loss of generality the factors are distributed according to a
Gaussian with zero mean and unit covariance. The noise is also zero mean
and has an arbitrary diagonal covariance matrix.
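
The generative model described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the loading matrix ``W``, the noise variances ``psi``, and the sample sizes are hypothetical values chosen for the demo, and the observations are taken to have zero offset. The sketch draws factors and noise, builds the observations, and compares the sample covariance with the covariance implied by the model, ``W.T @ W + diag(psi)``.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features, n_components = 20000, 5, 2

# Hypothetical ground-truth parameters, chosen only for illustration.
W = rng.normal(size=(n_components, n_features))   # loading matrix
psi = rng.uniform(0.5, 1.5, size=n_features)      # diagonal noise variances

# Latent factors: zero mean, unit covariance.
H = rng.normal(size=(n_samples, n_components))
# Noise: zero mean, diagonal covariance diag(psi).
E = rng.normal(size=(n_samples, n_features)) * np.sqrt(psi)

# Observations: linear transformation of the factors plus noise.
X = H @ W + E

# The model implies cov(X) = W^T W + diag(psi).
model_cov = W.T @ W + np.diag(psi)
sample_cov = np.cov(X, rowvar=False)
max_abs_err = np.abs(sample_cov - model_cov).max()
print(max_abs_err)
```

With enough samples the printed discrepancy should be small compared with the entries of ``model_cov``.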

If we restricted the model further by assuming that the Gaussian
noise is isotropic (all diagonal entries are the same), we would obtain
:class:`PPCA`.

FactorAnalysis performs a maximum likelihood estimate of the so-called
`loading` matrix, the transformation of the latent variables to the
observed ones, using expectation-maximization (EM).
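
To make the EM estimate concrete, here is a minimal sketch of the textbook EM updates for factor analysis (following the equations in Bishop, Section 12.2.4). It is not the routine used by the wrapped class, which may differ in detail (for example, it offers an SVD-based update through ``svd_method``). The function name ``fit_fa_em``, its arguments, and the synthetic data are assumptions made for this example.

```python
import numpy as np

def fit_fa_em(X, n_components, n_iter=100, seed=0):
    """Textbook EM for factor analysis; returns (W, psi, loglike)."""
    n_samples, n_features = X.shape
    Xc = X - X.mean(axis=0)                    # centre the observations
    S = Xc.T @ Xc / n_samples                  # maximum likelihood sample covariance
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_features, n_components))  # loadings, one column per factor
    psi = np.ones(n_features)                  # diagonal noise variances
    loglike = []
    for _ in range(n_iter):
        # Log likelihood of the data under the current parameters.
        C = W @ W.T + np.diag(psi)
        _, logdet = np.linalg.slogdet(C)
        loglike.append(-0.5 * n_samples * (
            n_features * np.log(2 * np.pi) + logdet
            + np.trace(np.linalg.solve(C, S))))
        # E-step: posterior covariance G and posterior means Ez of the factors.
        G = np.linalg.inv(np.eye(n_components) + (W.T / psi) @ W)
        Ez = Xc @ (W / psi[:, None]) @ G
        # M-step: update the loadings first, then the noise variances.
        XtEz = Xc.T @ Ez
        W = XtEz @ np.linalg.inv(n_samples * G + Ez.T @ Ez)
        psi = np.diag(S) - np.sum(W * XtEz, axis=1) / n_samples
    return W, psi, loglike

# Synthetic data from a hypothetical two-factor model.
rng = np.random.default_rng(1)
true_W = rng.normal(size=(6, 2))
X = rng.normal(size=(2000, 2)) @ true_W.T + 0.5 * rng.normal(size=(2000, 6))

W, psi, loglike = fit_fa_em(X, n_components=2)
print(loglike[0], loglike[-1])
```

The recorded log likelihood should never decrease from one iteration to the next, which is the property a stopping tolerance such as ``tol`` relies on.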

Read more in the :ref:`User Guide <FA>`.

**Parameters**

n_components : int | None
    Dimensionality of latent space, the number of components
    of ``X`` that are obtained after ``transform``.
    If None, n_components is set to the number of features.

tol : float
    Stopping tolerance for EM algorithm.

copy : bool
    Whether to make a copy of X. If ``False``, the input X gets overwritten
    during fitting.

max_iter : int
    Maximum number of iterations.

noise_variance_init : None | array, shape=(n_features,)
    The initial guess of the noise variance for each feature.
    If None, it defaults to ``np.ones(n_features)``.

svd_method : {'lapack', 'randomized'}
    Which SVD method to use. If 'lapack' use standard SVD from
    scipy.linalg, if 'randomized' use fast ``randomized_svd`` function.
    Defaults to 'randomized'. For most applications 'randomized' will
    be sufficiently precise while providing significant speed gains.
    Accuracy can also be improved by setting higher values for
    `iterated_power`. If this is not sufficient, for maximum precision
    you should choose 'lapack'.

iterated_power : int, optional
    Number of iterations for the power method. 3 by default. Only used
    if ``svd_method`` equals 'randomized'.

random_state : int or RandomState
    Pseudo-random number generator state used for random sampling. Only
    used if ``svd_method`` equals 'randomized'.

**Attributes**

``components_`` : array, [n_components, n_features]
    Components with maximum variance.

``loglike_`` : list, [n_iterations]
    The log likelihood at each iteration.

``noise_variance_`` : array, shape=(n_features,)
    The estimated noise variance for each feature.

``n_iter_`` : int
    Number of iterations run.
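
The parameters and attributes listed above can be tried directly on the wrapped scikit-learn class. The sketch below assumes a scikit-learn installation where the class is importable from ``sklearn.decomposition`` (the public import path, rather than the module path quoted in this page); the synthetic data and the chosen parameter values are illustrative only. How the MDP node forwards these keyword arguments is not shown here.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Synthetic data: 2 latent factors, 6 observed features (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2)) @ rng.normal(size=(2, 6)) + rng.normal(size=(300, 6))

fa = FactorAnalysis(
    n_components=2,          # dimensionality of the latent space
    tol=1e-2,                # stopping tolerance for EM
    max_iter=1000,           # upper bound on EM iterations
    svd_method="randomized",
    iterated_power=3,        # only used with the randomized SVD
    random_state=0,          # only used with the randomized SVD
)
Z = fa.fit_transform(X)      # latent representation of X

print(fa.components_.shape)      # (n_components, n_features)
print(fa.noise_variance_.shape)  # (n_features,)
print(len(fa.loglike_), fa.n_iter_)
```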

**References**

.. David Barber, Bayesian Reasoning and Machine Learning,
    Algorithm 21.1

.. Christopher M. Bishop: Pattern Recognition and Machine Learning,
    Chapter 12.2.4

See also

PCA: Principal component analysis is also a linear latent variable model,
    but it assumes equal noise variance for each feature.
    This extra assumption makes probabilistic PCA faster, as it can be
    computed in closed form.
FastICA: Independent component analysis, a latent variable model with
    non-Gaussian latent variables.
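
As an illustration of the closed-form remark about probabilistic PCA, the sketch below computes the maximum likelihood PPCA solution from an eigendecomposition of the sample covariance (the standard Tipping and Bishop result): the isotropic noise variance is the mean of the discarded eigenvalues, and the loadings are the leading eigenvectors scaled by the square root of their eigenvalue minus that variance. The function name ``ppca_closed_form`` and the synthetic data are assumptions made for this example, and ``n_components`` must be smaller than the number of features.

```python
import numpy as np

def ppca_closed_form(X, n_components):
    """Maximum likelihood PPCA via eigendecomposition (no iterations)."""
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / X.shape[0]                    # sample covariance
    eigvals, eigvecs = np.linalg.eigh(S)          # eigenvalues in ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]  # sort descending
    sigma2 = eigvals[n_components:].mean()        # isotropic noise variance
    # Loadings: leading eigenvectors scaled by sqrt(eigenvalue - sigma2).
    W = eigvecs[:, :n_components] * np.sqrt(eigvals[:n_components] - sigma2)
    return W, sigma2

# Synthetic data with isotropic noise of variance 0.25 (illustrative only).
rng = np.random.default_rng(0)
true_W = rng.normal(size=(6, 2))
X = rng.normal(size=(5000, 2)) @ true_W.T + 0.5 * rng.normal(size=(5000, 6))

W, sigma2 = ppca_closed_form(X, n_components=2)
print(W.shape, sigma2)
```

Factor analysis has no such shortcut because each feature has its own noise variance, which is why it needs an iterative procedure such as EM.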

Instance Methods
 
__init__(self, input_dim=None, output_dim=None, dtype=None, **kwargs)
Factor Analysis (FA)
 
_execute(self, x)
 
_get_supported_dtypes(self)
Return the list of dtypes supported by this node. The types can be specified in any format allowed by numpy.dtype.
 
_stop_training(self, **kwargs)
Concatenate the collected data in a single array.
 
execute(self, x)
Apply dimensionality reduction to X using the model.
 
stop_training(self, **kwargs)
Fit the FactorAnalysis model to X using EM

Inherited from unreachable.newobject: __long__, __native__, __nonzero__, __unicode__, next

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__

    Inherited from Cumulator
 
_train(self, *args)
Collect all input data in a list.
 
train(self, *args)
Collect all input data in a list.
    Inherited from Node
 
__add__(self, other)
 
__call__(self, x, *args, **kwargs)
Calling an instance of Node is equivalent to calling its execute method.
 
__repr__(self)
repr(x)
 
__str__(self)
str(x)
 
_check_input(self, x)
 
_check_output(self, y)
 
_check_train_args(self, x, *args, **kwargs)
 
_get_train_seq(self)
 
_if_training_stop_training(self)
 
_inverse(self, x)
 
_pre_execution_checks(self, x)
This method contains all pre-execution checks.
 
_pre_inversion_checks(self, y)
This method contains all pre-inversion checks.
 
_refcast(self, x)
Helper function to cast arrays to the internal dtype.
 
_set_dtype(self, t)
 
_set_input_dim(self, n)
 
_set_output_dim(self, n)
 
copy(self, protocol=None)
Return a deep copy of the node.
 
get_current_train_phase(self)
Return the index of the current training phase.
 
get_dtype(self)
Return dtype.
 
get_input_dim(self)
Return input dimensions.
 
get_output_dim(self)
Return output dimensions.
 
get_remaining_train_phase(self)
Return the number of training phases still to accomplish.
 
get_supported_dtypes(self)
Return dtypes supported by the node as a list of dtype objects.
 
has_multiple_training_phases(self)
Return True if the node has multiple training phases.
 
inverse(self, y, *args, **kwargs)
Invert y.
 
is_training(self)
Return True if the node is in the training phase, False otherwise.
 
save(self, filename, protocol=-1)
Save a pickled serialization of the node to filename. If filename is None, return a string.
 
set_dtype(self, t)
Set internal structures' dtype.
 
set_input_dim(self, n)
Set input dimensions.
 
set_output_dim(self, n)
Set output dimensions.
Static Methods
 
is_invertible()
Return True if the node can be inverted, False otherwise.
 
is_trainable()
Return True if the node can be trained, False otherwise.
Properties

Inherited from object: __class__

    Inherited from Node
  _train_seq
List of tuples:
  dtype
dtype
  input_dim
Input dimensions
  output_dim
Output dimensions
  supported_dtypes
Supported dtypes
Method Details

__init__(self, input_dim=None, output_dim=None, dtype=None, **kwargs)
(Constructor)

 

Factor Analysis (FA)


Overrides: object.__init__

_execute(self, x)

 
Overrides: Node._execute

_get_supported_dtypes(self)

 
Return the list of dtypes supported by this node. The types can be specified in any format allowed by numpy.dtype.
Overrides: Node._get_supported_dtypes

_stop_training(self, **kwargs)

 
Concatenate the collected data in a single array.
Overrides: Node._stop_training
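
The collect-then-concatenate behaviour described for ``train`` and ``_stop_training`` can be pictured with a small stand-in class. ``TinyCumulator`` below is hypothetical and is not MDP code; it only mirrors the pattern of storing each training chunk in a list and joining the chunks into a single array when training stops, at which point a batch algorithm can be fitted on the full data.

```python
import numpy as np

class TinyCumulator:
    """Hypothetical stand-in for the collect-then-fit pattern; not MDP code."""

    def __init__(self):
        self._chunks = []   # training chunks collected so far
        self.data = None    # single array, available after stop_training()

    def train(self, x):
        # Collect the incoming chunk without processing it.
        self._chunks.append(np.asarray(x))

    def stop_training(self):
        # Concatenate all collected chunks along the sample axis.
        self.data = np.concatenate(self._chunks, axis=0)
        self._chunks = []
        return self.data

node = TinyCumulator()
node.train(np.zeros((10, 3)))
node.train(np.ones((5, 3)))
full = node.stop_training()
print(full.shape)
```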

execute(self, x)

 

Apply dimensionality reduction to X using the model.

Compute the expected mean of the latent variables. See Barber, 21.2.33 (or Bishop, 12.66).

Parameters

X : array-like, shape (n_samples, n_features)
Training data.

Returns

X_new : array-like, shape (n_samples, n_components)
The latent variables of X.
Overrides: Node.execute
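
The expected mean of the latent variables mentioned above (Bishop, 12.66) can be written out in NumPy. This is a sketch under stated assumptions: ``components`` has shape (n_components, n_features) like ``components_``, ``noise_variance`` holds the per-feature noise variances, and ``mean`` is the per-feature offset subtracted before projection. The function name and the random parameters are hypothetical. The result is cross-checked against an algebraically equivalent form that inverts the full data covariance.

```python
import numpy as np

def expected_latent_mean(X, components, noise_variance, mean):
    """Posterior mean E[z | x] for each row of X under a factor analysis model."""
    Wpsi = components / noise_variance                  # W Psi^-1, shape (k, d)
    k = components.shape[0]
    cov_z = np.linalg.inv(np.eye(k) + Wpsi @ components.T)  # posterior covariance
    return (X - mean) @ Wpsi.T @ cov_z

# Hypothetical model parameters and data, for illustration only.
rng = np.random.default_rng(0)
components = rng.normal(size=(2, 5))
noise_variance = rng.uniform(0.5, 1.5, size=5)
mean = rng.normal(size=5)
X = rng.normal(size=(8, 5))

Z = expected_latent_mean(X, components, noise_variance, mean)

# Equivalent form: W (W^T W + Psi)^-1 (x - mean).
C = components.T @ components + np.diag(noise_variance)
Z_alt = (X - mean) @ np.linalg.inv(C) @ components.T
print(np.allclose(Z, Z_alt))
```

The first form only inverts an n_components by n_components matrix, which is cheaper when the latent space is much smaller than the feature space.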

is_invertible()
Static Method

 
Return True if the node can be inverted, False otherwise.
Overrides: Node.is_invertible
(inherited documentation)

is_trainable()
Static Method

 
Return True if the node can be trained, False otherwise.
Overrides: Node.is_trainable

stop_training(self, **kwargs)

 

Fit the FactorAnalysis model to X using EM

Parameters

X : array-like, shape (n_samples, n_features)
Training data.

Returns

self

Overrides: Node.stop_training