Perform Principal Component Analysis using the NIPALS algorithm.
This algorithm is particularly useful if you have more variables than
observations, or in general when the number of variables is huge and
calculating a full covariance matrix may be infeasible. It is also more
efficient than the standard PCANode if you expect the number of significant
principal components to be small. In this case, setting output_dim to
a certain fraction of the total variance, say 90%, may be of some help.
Internal variables of interest:
self.avg -- Mean of the input data (available after training)
self.d -- Variance corresponding to the PCA components
self.v -- Transpose of the projection matrix (available after training)
self.explained_variance -- When output_dim has been specified as a fraction of the total variance, this is the fraction of the total variance that is actually explained
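
The iteration behind the node can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the node's actual implementation: the function name `nipals_pca`, the choice of starting vector, and the return values are hypothetical, and the local names `avg`, `d` and `v` only mirror the attributes listed above.

```python
import numpy as np

def nipals_pca(x, n_components, conv=1e-8, max_it=100000):
    """Illustrative NIPALS PCA sketch (hypothetical helper)."""
    x = np.array(x, dtype=float)        # copy: x is deflated in place below
    avg = x.mean(axis=0)
    x -= avg                            # center the data
    n_obs, n_vars = x.shape
    v = np.zeros((n_vars, n_components))
    d = np.zeros(n_components)
    for i in range(n_components):
        # assumption: start from the column with the largest remaining variance
        t = x[:, np.argmax(x.var(axis=0))].copy()
        for _ in range(max_it):
            p = x.T @ t / (t @ t)       # loading vector for the current scores
            p /= np.linalg.norm(p)      # normalize the loading to unit length
            t_new = x @ p               # new scores: project x onto the loading
            diff = np.sum((t_new - t) ** 2)
            t = t_new
            if diff < conv:             # residual change is below the threshold
                break
        else:
            raise RuntimeError("component %d did not converge" % i)
        v[:, i] = p
        d[i] = t @ t / (n_obs - 1)      # variance along this component
        x -= np.outer(t, p)             # deflate: remove the component from x
    return avg, d, v
```

Only matrix-vector products with the data are needed, so the full covariance matrix is never formed. Under these assumptions, new data would be projected as `(data - avg) @ v`.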
Reference for NIPALS (Nonlinear Iterative Partial Least Squares):
Wold, H.
Nonlinear estimation by iterative least squares procedures
in David, F. (Editor), Research Papers in Statistics, Wiley,
New York, pp 411-444 (1966).
More information about Principal Component Analysis, a.k.a. the discrete
Karhunen-Loeve transform, can be found, among others, in
I.T. Jolliffe, Principal Component Analysis, Springer-Verlag (1986).
Original code contributed by:
Michael Schmuker, Susanne Lezius, and Farzad Farkhooi (2008).
| Inherited from Node | Description |
|---|---|
| _train_seq | List of tuples: [(training-phase1, stop-training-phase1), (training-phase2, stop_training-phase2), ... |
| dtype | dtype |
| input_dim | Input dimensions |
| output_dim | Output dimensions |
| supported_dtypes | Supported dtypes |

The number of principal components to be kept can be specified as 'output_dim' directly (e.g. 'output_dim=10' means 10 components are kept) or by the fraction of variance to be explained (e.g. 'output_dim=0.95' means that as many components as necessary will be kept in order to explain 95% of the input variance).

Other arguments:

conv - convergence threshold for the residual error.

max_it - maximum number of iterations.
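
The two ways of specifying 'output_dim' can be illustrated with a small sketch. It assumes the component variances are already known and sorted in decreasing order, which is a simplification. The helper name `resolve_output_dim` is hypothetical and not part of the library.

```python
import numpy as np

def resolve_output_dim(variances, output_dim):
    """Hypothetical helper: map output_dim to (component count, explained fraction)."""
    variances = np.asarray(variances, dtype=float)
    total = variances.sum()
    if isinstance(output_dim, float) and 0 < output_dim <= 1:
        # fraction of variance: smallest k whose cumulative share reaches it
        explained = np.cumsum(variances) / total
        k = int(np.searchsorted(explained, output_dim)) + 1
        k = min(k, len(variances))      # guard against rounding at 1.0
        return k, float(explained[k - 1])
    # otherwise output_dim is taken as the number of components to keep
    k = int(output_dim)
    return k, float(variances[:k].sum() / total)
```

With variances `[4, 3, 2, 1]`, a request of `0.8` resolves to 3 components that explain 0.9 of the total variance, which is the kind of value the description of `self.explained_variance` refers to.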
Transform the data list to an array object and reshape it.

Cumulate all input data in a one-dimensional list.
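
The accumulate-then-reshape behaviour described by these two docstrings might look roughly like the following. This is a sketch under stated assumptions: the class `ChunkCumulator` and its method names are hypothetical stand-ins, not the library's classes.

```python
import numpy as np

class ChunkCumulator:
    """Hypothetical stand-in: collect training chunks, then build one array."""

    def __init__(self):
        self.data = []          # flat, one-dimensional list of values
        self.tlen = 0           # number of observations seen so far
        self.input_dim = None   # number of variables, set on the first chunk

    def train(self, x):
        x = np.asarray(x, dtype=float)
        if self.input_dim is None:
            self.input_dim = x.shape[1]
        self.tlen += x.shape[0]
        # append the chunk row by row to the flat list
        self.data.extend(x.ravel().tolist())

    def stop_training(self):
        # turn the flat list into an array of shape (tlen, input_dim)
        self.data = np.array(self.data).reshape(self.tlen, self.input_dim)
        return self.data
```

Keeping a flat list makes each training call cheap, and the single reshape at the end restores the observations-by-variables layout the analysis needs.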
| Generated by Epydoc 3.0.1 on Thu May 15 15:13:38 2008 | http://epydoc.sourceforge.net |