
Class PLSCanonicalScikitsLearnNode



PLSCanonical implements the two-block canonical PLS of the original Wold
algorithm [Tenenhaus 1998], p. 204, referred to as PLS-C2A in [Wegelin 2000].

This node has been automatically generated by wrapping the ``sklearn.cross_decomposition.pls_.PLSCanonical`` class
from the ``sklearn`` library.  The wrapped instance can be accessed
through the ``scikits_alg`` attribute.
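
A minimal usage sketch (assuming MDP's usual supervised
train/stop_training/execute cycle for these wrapper nodes; the exact
train signature is an assumption)::

    import numpy as np
    import mdp

    x = np.random.rand(50, 3)   # predictor block X
    y = np.random.rand(50, 2)   # response block Y

    node = mdp.nodes.PLSCanonicalScikitsLearnNode(n_components=2)
    node.train(x, y)            # Cumulator node: collects the data
    node.stop_training()        # fits the wrapped PLSCanonical
    x_scores = node.execute(x)  # applies the learned reduction
    print(node.scikits_alg)     # the underlying sklearn estimator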

This class inherits from PLS with mode="A", deflation_mode="canonical",
norm_y_weights=True and algorithm="nipals", but algorithm="svd" should
provide similar results up to numerical errors.

Read more in the :ref:`User Guide <cross_decomposition>`.

**Parameters**

scale : boolean, (default True)
    Whether to scale the data.

algorithm : string, "nipals" or "svd"
    The algorithm used to estimate the weights. It will be called
    n_components times, i.e. once for each iteration of the outer loop.

max_iter : int, (default 500)
    The maximum number of iterations of the NIPALS inner loop (used
    only if algorithm="nipals").

tol : non-negative real, (default 1e-06)
    The tolerance used in the iterative algorithm.

copy : boolean, (default True)
    Whether the deflation should be done on a copy. Leave the default
    value True unless you don't care about side effects.

n_components : int, (default 2)
    Number of components to keep.
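
Continuing the sketch above, the keyword arguments listed here are
passed to the node constructor via ``**kwargs`` and forwarded to the
wrapped sklearn estimator (a sketch)::

    node = mdp.nodes.PLSCanonicalScikitsLearnNode(
        n_components=2, algorithm="nipals", max_iter=500, tol=1e-6)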

**Attributes**

``x_weights_`` : array, shape = [p, n_components]
    X block weights vectors.

``y_weights_`` : array, shape = [q, n_components]
    Y block weights vectors.

``x_loadings_`` : array, shape = [p, n_components]
    X block loadings vectors.

``y_loadings_`` : array, shape = [q, n_components]
    Y block loadings vectors.

``x_scores_`` : array, shape = [n_samples, n_components]
    X scores.

``y_scores_`` : array, shape = [n_samples, n_components]
    Y scores.

``x_rotations_`` : array, shape = [p, n_components]
    X block to latents rotations.

``y_rotations_`` : array, shape = [q, n_components]
    Y block to latents rotations.

``n_iter_`` : array-like
    Number of iterations of the NIPALS inner loop for each
    component. Not useful if the algorithm provided is "svd".

**Notes**

Matrices::


    T: ``x_scores_``
    U: ``y_scores_``
    W: ``x_weights_``
    C: ``y_weights_``
    P: ``x_loadings_``
    Q: ``y_loadings_``

Are computed such that::


    X = T P.T + Err and Y = U Q.T + Err
    T[:, k] = Xk W[:, k] for k in range(n_components)
    U[:, k] = Yk C[:, k] for k in range(n_components)
    ``x_rotations_`` = W (P.T W)^(-1)
    ``y_rotations_`` = C (Q.T C)^(-1)

where Xk and Yk are residual matrices at iteration k.
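
The rotation identities above can be checked numerically; this is a
minimal sketch using only sklearn's public fitted attributes::

    import numpy as np
    from sklearn.cross_decomposition import PLSCanonical

    rng = np.random.RandomState(0)
    X, Y = rng.rand(20, 5), rng.rand(20, 3)
    pls = PLSCanonical(n_components=2).fit(X, Y)

    W, P = pls.x_weights_, pls.x_loadings_
    rot = W @ np.linalg.pinv(P.T @ W)          # W (P.T W)^(-1)
    print(np.allclose(rot, pls.x_rotations_))  # expected: True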

`Slides explaining PLS <http://www.eigenvector.com/Docs/Wise_pls_properties.pdf>`_

For each component k, find weights u, v that optimize::


    max corr(Xk u, Yk v) * std(Xk u) std(Yk v), such that ``|u| = |v| = 1``

Note that it maximizes both the correlations between the scores and the
intra-block variances.
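
The unit-norm constraint on u and v can likewise be verified on a
fitted estimator; a self-contained sketch::

    import numpy as np
    from sklearn.cross_decomposition import PLSCanonical

    rng = np.random.RandomState(1)
    pls = PLSCanonical(n_components=2).fit(rng.rand(30, 4), rng.rand(30, 3))
    # Each weight column should have unit Euclidean norm.
    print(np.allclose(np.linalg.norm(pls.x_weights_, axis=0), 1.0))
    print(np.allclose(np.linalg.norm(pls.y_weights_, axis=0), 1.0))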

The residual matrix of X (the Xk+1 block) is obtained by deflation on the
current X score: x_score.

The residual matrix of Y (the Yk+1 block) is obtained by deflation on the
current Y score. This performs a canonical, symmetric version of the PLS
regression, slightly different from CCA; it is mostly used for modeling.
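
One deflation step can be written compactly; the following is an
illustrative sketch of the rank-one deflation described above, not
sklearn's internal code::

    import numpy as np

    def deflate(Xk, t):
        # Regress Xk on the score vector t and subtract the rank-one
        # reconstruction, yielding the residual matrix X_{k+1}.
        p = Xk.T @ t / (t @ t)
        return Xk - np.outer(t, p)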

This implementation provides the same results as the "plspm" package
in the R language (R-project), using the function plsca(X, Y).
Results are equal or collinear with those of the function
``pls(..., mode = "canonical")`` of the "mixOmics" package. The difference
lies in the fact that the mixOmics implementation does not exactly
implement the Wold algorithm, since it does not normalize y_weights to one.

**Examples**

>>> from sklearn.cross_decomposition import PLSCanonical
>>> X = [[0., 0., 1.], [1.,0.,0.], [2.,2.,2.], [2.,5.,4.]]
>>> Y = [[0.1, -0.2], [0.9, 1.1], [6.2, 5.9], [11.9, 12.3]]
>>> plsca = PLSCanonical(n_components=2)
>>> plsca.fit(X, Y)
... # doctest: +NORMALIZE_WHITESPACE
PLSCanonical(algorithm='nipals', copy=True, max_iter=500, n_components=2,
             scale=True, tol=1e-06)
>>> X_c, Y_c = plsca.transform(X, Y)
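
As a hedged follow-up to the example, the per-component correlation
between the paired scores can be inspected directly (X_c and Y_c as
returned by the transform above)::

    import numpy as np
    # One correlation per extracted component; canonical PLS makes
    # these large by construction.
    print([np.corrcoef(X_c[:, k], Y_c[:, k])[0, 1] for k in range(2)])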

**References**


Jacob A. Wegelin. A survey of Partial Least Squares (PLS) methods, with
emphasis on the two-block case. Technical Report 371, Department of
Statistics, University of Washington, Seattle, 2000.

Tenenhaus, M. (1998). La régression PLS: théorie et pratique. Paris:
Éditions Technip.

See also

CCA
PLSSVD

Instance Methods
 
__init__(self, input_dim=None, output_dim=None, dtype=None, **kwargs)
PLSCanonical implements the two-block canonical PLS of the original Wold algorithm [Tenenhaus 1998], p. 204, referred to as PLS-C2A in [Wegelin 2000].
 
_execute(self, x)
 
_get_supported_dtypes(self)
Return the list of dtypes supported by this node. The types can be specified in any format allowed by numpy.dtype.
 
_stop_training(self, **kwargs)
Concatenate the collected data in a single array.
 
execute(self, x)
Apply the dimension reduction learned on the train data.
 
stop_training(self, **kwargs)
Fit model to data.

Inherited from unreachable.newobject: __long__, __native__, __nonzero__, __unicode__, next

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__

    Inherited from Cumulator
 
_train(self, *args)
Collect all input data in a list.
 
train(self, *args)
Collect all input data in a list.
    Inherited from Node
 
__add__(self, other)
 
__call__(self, x, *args, **kwargs)
Calling an instance of Node is equivalent to calling its execute method.
 
__repr__(self)
repr(x)
 
__str__(self)
str(x)
 
_check_input(self, x)
 
_check_output(self, y)
 
_check_train_args(self, x, *args, **kwargs)
 
_get_train_seq(self)
 
_if_training_stop_training(self)
 
_inverse(self, x)
 
_pre_execution_checks(self, x)
This method contains all pre-execution checks.
 
_pre_inversion_checks(self, y)
This method contains all pre-inversion checks.
 
_refcast(self, x)
Helper function to cast arrays to the internal dtype.
 
_set_dtype(self, t)
 
_set_input_dim(self, n)
 
_set_output_dim(self, n)
 
copy(self, protocol=None)
Return a deep copy of the node.
 
get_current_train_phase(self)
Return the index of the current training phase.
 
get_dtype(self)
Return dtype.
 
get_input_dim(self)
Return input dimensions.
 
get_output_dim(self)
Return output dimensions.
 
get_remaining_train_phase(self)
Return the number of training phases still to accomplish.
 
get_supported_dtypes(self)
Return dtypes supported by the node as a list of dtype objects.
 
has_multiple_training_phases(self)
Return True if the node has multiple training phases.
 
inverse(self, y, *args, **kwargs)
Invert y.
 
is_training(self)
Return True if the node is in the training phase, False otherwise.
 
save(self, filename, protocol=-1)
Save a pickled serialization of the node to filename. If filename is None, return a string.
 
set_dtype(self, t)
Set internal structures' dtype.
 
set_input_dim(self, n)
Set input dimensions.
 
set_output_dim(self, n)
Set output dimensions.
Static Methods
 
is_invertible()
Return True if the node can be inverted, False otherwise.
 
is_trainable()
Return True if the node can be trained, False otherwise.
Properties

Inherited from object: __class__

    Inherited from Node
  _train_seq
List of tuples:
  dtype
dtype
  input_dim
Input dimensions
  output_dim
Output dimensions
  supported_dtypes
Supported dtypes
Method Details

__init__(self, input_dim=None, output_dim=None, dtype=None, **kwargs)
(Constructor)

 

Documentation is identical to the class docstring above.

Overrides: object.__init__

_execute(self, x)

 
Overrides: Node._execute

_get_supported_dtypes(self)

 
Return the list of dtypes supported by this node. The types can be specified in any format allowed by numpy.dtype.
Overrides: Node._get_supported_dtypes

_stop_training(self, **kwargs)

 
Concatenate the collected data in a single array.
Overrides: Node._stop_training

execute(self, x)

 

Apply the dimension reduction learned on the training data.

This node has been automatically generated by wrapping the sklearn.cross_decomposition.pls_.PLSCanonical class from the sklearn library. The wrapped instance can be accessed through the scikits_alg attribute.

**Parameters**

X : array-like of predictors, shape = [n_samples, p]
    Training vectors, where n_samples is the number of samples and
    p is the number of predictors.

Y : array-like of response, shape = [n_samples, q], optional
    Training vectors, where n_samples is the number of samples and
    q is the number of response variables.

copy : boolean, (default True)
    Whether to copy X and Y, or perform in-place normalization.

**Returns**

x_scores if Y is not given, (x_scores, y_scores) otherwise.
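
A usage sketch (hedged; ``node`` and ``x`` as in the class example
above)::

    x_scores = node.execute(x)  # shape (n_samples, n_components)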

Overrides: Node.execute

is_invertible()
Static Method

 
Return True if the node can be inverted, False otherwise.
Overrides: Node.is_invertible
(inherited documentation)

is_trainable()
Static Method

 
Return True if the node can be trained, False otherwise.
Overrides: Node.is_trainable

stop_training(self, **kwargs)

 

Fit model to data.

This node has been automatically generated by wrapping the sklearn.cross_decomposition.pls_.PLSCanonical class from the sklearn library. The wrapped instance can be accessed through the scikits_alg attribute.

**Parameters**

X : array-like, shape = [n_samples, n_features]
    Training vectors, where n_samples is the number of samples and
    n_features is the number of predictors.

Y : array-like of response, shape = [n_samples, n_targets]
    Target vectors, where n_samples is the number of samples and
    n_targets is the number of response variables.
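
A sketch of the corresponding training cycle (batch names are
hypothetical; the Cumulator collects every train call before fitting)::

    node = mdp.nodes.PLSCanonicalScikitsLearnNode(n_components=2)
    node.train(x1, y1)      # collect a first batch
    node.train(x2, y2)      # collect another batch
    node.stop_training()    # concatenate the batches and fit
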
Overrides: Node.stop_training