Perform Principal Component Analysis using the NIPALS algorithm.
This algorithm is particularly useful if you have more variables than
observations, or in general when the number of variables is huge and
calculating a full covariance matrix may be infeasible. It is also more
efficient than the standard PCANode if you expect the number of significant
principal components to be small. In this case setting output_dim to be
a certain fraction of the total variance, say 90%, may be of some help.
Reference for NIPALS (Nonlinear Iterative Partial Least Squares):
Wold, H., Nonlinear estimation by iterative least squares procedures,
in David, F. (Editor), Research Papers in Statistics, Wiley,
New York, pp. 411-444 (1966).
More information about Principal Component Analysis, also known as the
discrete Karhunen-Loeve transform, can be found, among other sources, in
I.T. Jolliffe, Principal Component Analysis, Springer-Verlag (1986).
Original code contributed by:
Michael Schmuker, Susanne Lezius, and Farzad Farkhooi (2008).
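The iteration behind NIPALS can be sketched in plain NumPy. The function below is only an illustration, not the code of this node: the name nipals_pca, its return values, and the choice of starting vector are assumptions made for the example. Only conv and max_it mirror the constructor arguments of the node. Note that the sketch works on the data matrix directly and never forms a covariance matrix.

```python
import numpy as np

def nipals_pca(x, n_components, conv=1e-8, max_it=100000):
    """Extract the leading principal components one at a time (sketch)."""
    x = np.asarray(x, dtype=float)
    residual = x - x.mean(axis=0)              # centre the data
    n_obs, n_vars = residual.shape
    components = np.empty((n_vars, n_components))
    variances = np.empty(n_components)
    for k in range(n_components):
        # Start from the column with the largest remaining variance.
        t = residual[:, np.argmax(residual.var(axis=0))].copy()
        for _ in range(max_it):
            p = residual.T @ t                 # loading estimate
            p /= np.linalg.norm(p)             # keep the loading at unit length
            t_new = residual @ p               # updated score vector
            diff = np.linalg.norm(t_new - t)
            t = t_new
            if diff < conv:                    # scores stopped changing
                break
        components[:, k] = p
        variances[k] = t @ t / (n_obs - 1)     # variance along this component
        residual = residual - np.outer(t, p)   # deflate: remove the component
    return components, variances
```

Because each component is found separately, the cost grows with the number of components requested, which is why the approach pays off when only a few components are significant.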
__init__(self, input_dim=None, output_dim=None, dtype=None, conv=1e-08, max_it=100000)
The number of principal components to be kept can be specified as
'output_dim' directly (e.g. 'output_dim=10' means 10 components
are kept) or by the fraction of variance to be explained
(e.g. 'output_dim=0.95' means that as many components as necessary
will be kept in order to explain 95% of the input variance).
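The two meanings of 'output_dim' can be illustrated with a small helper. This is a sketch under stated assumptions, not the node's own logic: the function name components_to_keep is hypothetical, and it assumes the per-component variances are already known and sorted in decreasing order.

```python
def components_to_keep(variances, output_dim):
    """Resolve output_dim to a number of components (hypothetical helper).

    variances: per-component variances, sorted in decreasing order.
    output_dim: an int (component count) or a float fraction of variance.
    """
    if isinstance(output_dim, int):
        return output_dim                      # explicit number of components
    total = sum(variances)
    explained = 0.0
    for count, var in enumerate(variances, start=1):
        explained += var
        if explained / total >= output_dim:    # requested fraction reached
            return count
    return len(variances)

print(components_to_keep([5.0, 3.0, 1.5, 0.5], 2))
print(components_to_keep([5.0, 3.0, 1.5, 0.5], 0.9))
```

With a fractional value the number of components is only known once enough variance has been accumulated, so the explained variance usually ends up slightly above the requested fraction.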
_stop_training(self, debug=False)
Concatenate the collected data into a single array.
_train(self, x)
Collect all input data in a list.
stop_training(self, debug=False)
Concatenate the collected data into a single array.
train(self, x)
Collect all input data in a list.
Inherited from unreachable.newobject: __long__, __native__, __nonzero__, __unicode__, next
Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__
_execute(self, x, n=None)
Project the input on the first 'n' principal components.
If 'n' is not set, use all available components.
_inverse(self, y, n=None)
Project 'y' to the input space using the first 'n' components.
If 'n' is not set, use all available components.
execute(self, x, n=None)
Project the input on the first 'n' principal components.
If 'n' is not set, use all available components.
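The projection can be pictured as a matrix product with the first 'n' columns of the component matrix. The sketch below is an illustration only: the function name project and its arguments components (one component per column) and mean are assumptions made for the example, as is the subtraction of the training mean before projecting.

```python
import numpy as np

def project(x, components, mean, n=None):
    """Project the rows of x onto the first n components (sketch)."""
    if n is None:
        n = components.shape[1]                # use all available components
    return (x - mean) @ components[:, :n]

x = np.array([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]])
mean = x.mean(axis=0)
components = np.eye(3)[:, :2]                  # toy basis with two components
print(project(x, components, mean).shape)
print(project(x, components, mean, n=1).shape)
```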
get_explained_variance(self)
Return the fraction of the original variance that can be
explained by self._output_dim PCA components.
If for example output_dim has been set to 0.95, the explained
variance could be something like 0.958...
Note that if output_dim was explicitly set to be a fixed number
of components, there is no way to calculate the explained variance.
get_recmatrix(self, transposed=1)
Return the back-projection matrix (i.e. the reconstruction matrix).
inverse(self, y, n=None)
Project 'y' to the input space using the first 'n' components.
If 'n' is not set, use all available components.
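Back-projection multiplies the first 'n' output coordinates by the transposed component matrix. The sketch below is an illustration, not the node's code: the name back_project and the arguments components (orthonormal, one component per column) and mean are assumptions made for the example.

```python
import numpy as np

def back_project(y, components, mean, n=None):
    """Map projected data y back to the input space (sketch)."""
    if n is None:
        n = y.shape[1]                         # use all available components
    return y[:, :n] @ components[:, :n].T + mean

rng = np.random.default_rng(1)
x = rng.normal(size=(10, 3))
mean = x.mean(axis=0)
basis, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthonormal basis
y = (x - mean) @ basis
print(np.allclose(back_project(y, basis, mean), x))
```

With all components of an orthonormal basis the input is recovered exactly (up to rounding); with fewer components the result is an approximation of the input.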
__call__(self, x, *args, **kwargs)
Calling an instance of Node is equivalent to calling
its execute method.
_refcast(self, x)
Helper function to cast arrays to the internal dtype.
copy(self, protocol=None)
Return a deep copy of the node.
is_training(self)
Return True if the node is in the training phase,
False otherwise.
save(self, filename, protocol=-1)
Save a pickled serialization of the node to filename.
If filename is None, return a string.
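The described behaviour can be approximated with the standard pickle module. This is a sketch under assumptions, not the method's actual code: the function name save_object is hypothetical, and on Python 3 the value returned for filename=None is a bytes object rather than a text string.

```python
import pickle

def save_object(obj, filename, protocol=-1):
    """Pickle obj to filename, or return the pickled data if filename is None."""
    data = pickle.dumps(obj, protocol)         # -1 selects the highest protocol
    if filename is None:
        return data
    with open(filename, "wb") as handle:       # pickles are binary data
        handle.write(data)
    return None

payload = save_object({"output_dim": 0.9}, None)
print(pickle.loads(payload))
```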
set_dtype(self, t)
Set internal structures' dtype.