- n_clusters : int, optional, default: 8
  The number of clusters to form as well as the number of
  centroids to generate.
- max_iter : int, optional
  Maximum number of iterations over the complete dataset before
  stopping, independently of any early-stopping heuristic.
- max_no_improvement : int, default: 10
  Control early stopping based on the number of consecutive mini
  batches that do not yield an improvement in the smoothed inertia.
  To disable convergence detection based on inertia, set
  max_no_improvement to None.
- tol : float, default: 0.0
  Control early stopping based on the relative center changes, as
  measured by a smoothed, variance-normalized mean of the squared
  center position changes. This early-stopping heuristic is
  closer to the one used for the batch variant of the algorithm,
  but it adds a slight computational and memory overhead compared
  with the inertia heuristic.
  To disable convergence detection based on normalized center
  change, set tol to 0.0 (default).
- batch_size : int, optional, default: 100
  Size of the mini batches.
- init_size : int, optional, default: 3 * batch_size
  Number of samples to randomly sample for speeding up the
  initialization (sometimes at the expense of accuracy): the
  algorithm is initialized by running a batch KMeans on a
  random subset of the data. This needs to be larger than n_clusters.
- init : {'k-means++', 'random'} or ndarray, default: 'k-means++'
  Method for initialization:
  'k-means++': select initial cluster centers for k-means
  clustering in a smart way to speed up convergence. See the
  Notes section of k_init for more details.
  'random': choose k observations (rows) at random from the data as
  the initial centroids.
  If an ndarray is passed, it should be of shape (n_clusters, n_features)
  and gives the initial centers.
- n_init : int, default: 3
  Number of random initializations that are tried.
  In contrast to KMeans, the algorithm is only run once, using the
  best of the n_init initializations as measured by inertia.
- compute_labels : boolean, default: True
  Compute label assignment and inertia for the complete dataset
  once the minibatch optimization has converged in fit.
- random_state : integer or numpy.random.RandomState, optional
  The generator used to initialize the centers. If an integer is
  given, it fixes the seed. Defaults to the global numpy random
  number generator.
- reassignment_ratio : float, default: 0.01
  Control the fraction of the maximum number of counts for a
  center to be reassigned. A higher value means that low-count
  centers are more easily reassigned, so the model takes longer
  to converge but should converge to a better clustering.
- verbose : boolean, optional
  Verbosity mode.
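The interplay between batch_size, max_iter, and max_no_improvement can be sketched in plain Python. This is an illustrative toy, not the wrapped library's implementation: the function name `minibatch_kmeans_sketch`, the fixed smoothing factor of 0.3, and the list-of-tuples data layout are assumptions made for the example, and the tol and reassignment_ratio heuristics are left out. Initial centers are passed explicitly, mirroring the ndarray form of init.

```python
import random


def minibatch_kmeans_sketch(data, init, batch_size=10, max_iter=20,
                            max_no_improvement=10, seed=0):
    """Toy mini-batch k-means over a list of equal-length tuples."""
    rng = random.Random(seed)
    centers = [list(c) for c in init]   # explicit initial centers
    counts = [0] * len(centers)         # samples assigned to each center so far
    smoothed = best = None
    stale = 0                           # consecutive batches without improvement
    batches_per_pass = -(-len(data) // batch_size)  # ceiling division
    for _ in range(max_iter * batches_per_pass):
        batch = rng.sample(data, batch_size)
        batch_inertia = 0.0
        for x in batch:
            # Assign the sample to its nearest center (squared Euclidean distance).
            d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers]
            k = d2.index(min(d2))
            batch_inertia += d2[k]
            # Move that center toward the sample with a per-center step size.
            counts[k] += 1
            rate = 1.0 / counts[k]
            centers[k] = [c + rate * (a - c) for a, c in zip(x, centers[k])]
        # Assumption: smooth the per-batch inertia with a fixed factor of 0.3.
        if smoothed is None:
            smoothed = batch_inertia
        else:
            smoothed = 0.7 * smoothed + 0.3 * batch_inertia
        # Patience rule: stop after max_no_improvement stale batches in a row.
        if best is None or smoothed < best:
            best, stale = smoothed, 0
        else:
            stale += 1
        if max_no_improvement is not None and stale >= max_no_improvement:
            break
    return centers


# Two well-separated 1-D groups, one around 0 and one around 10.
data = [(i / 10.0 - 1.0,) for i in range(21)] + [(9.0 + i / 10.0,) for i in range(21)]
centers = minibatch_kmeans_sketch(data, init=[(0.0,), (10.0,)])
```

Passing max_no_improvement=None in this sketch disables the patience rule, so the loop runs for the full max_iter passes.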
- __init__(self, input_dim=None, output_dim=None, dtype=None, **kwargs)
  Mini-Batch K-Means clustering.
- _get_supported_dtypes(self)
  Return the list of dtypes supported by this node.
  The types can be specified in any format allowed by numpy.dtype.
- _stop_training(self, **kwargs)
  Concatenate the collected data into a single array.
- execute(self, x)
  Transform X to a cluster-distance space.
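As a rough illustration of a cluster-distance space, the sketch below maps each sample to its distances from every center, giving one output column per cluster. It assumes Euclidean distance and plain tuples; the helper name `to_cluster_distance_space` is hypothetical and is not part of the node's API.

```python
import math


def to_cluster_distance_space(X, centers):
    """Return one row per sample holding its distance to each center."""
    return [[math.dist(x, c) for c in centers] for x in X]


X = [(0.0, 0.0), (3.0, 4.0)]
centers = [(0.0, 0.0), (3.0, 0.0)]
distances = to_cluster_distance_space(X, centers)
# distances has len(X) rows and len(centers) columns.
```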
- stop_training(self, **kwargs)
  Compute the centroids on X by chunking it into mini-batches.
- Inherited from unreachable.newobject: __long__, __native__, __nonzero__, __unicode__, next
- Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__
- _train(self, *args)
  Collect all input data in a list.
- train(self, *args)
  Collect all input data in a list.
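The collect-then-concatenate pattern described for train and _stop_training might look like the following minimal sketch. `CollectingNode` is a hypothetical stand-in written for illustration, using plain lists instead of arrays; it is not the node's actual code.

```python
class CollectingNode:
    """Hypothetical node that buffers training chunks until training stops."""

    def __init__(self):
        self._chunks = []       # one entry per train() call
        self._training = True
        self.data = None        # filled in by stop_training()

    def train(self, chunk):
        if not self._training:
            raise RuntimeError("the training phase is already closed")
        self._chunks.append(list(chunk))

    def stop_training(self):
        # Concatenate the buffered chunks into a single list of rows.
        self.data = [row for chunk in self._chunks for row in chunk]
        self._chunks = []
        self._training = False

    def is_training(self):
        return self._training


node = CollectingNode()
node.train([(1.0,), (2.0,)])
node.train([(3.0,)])
node.stop_training()
```

Deferring the concatenation to the end keeps each train call cheap, at the cost of holding every chunk in memory until training stops.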
- __call__(self, x, *args, **kwargs)
  Calling an instance of Node is equivalent to calling
  its execute method.
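The equivalence between calling a node and calling its execute method can be shown with a small delegation sketch. `CallableNode` and its doubling execute are hypothetical placeholders chosen only to make the behavior visible.

```python
class CallableNode:
    """Hypothetical node whose __call__ forwards to execute."""

    def execute(self, x):
        # Placeholder transformation: double every value.
        return [2 * v for v in x]

    def __call__(self, x, *args, **kwargs):
        return self.execute(x, *args, **kwargs)


node = CallableNode()
via_call = node([1, 2, 3])
via_execute = node.execute([1, 2, 3])
```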
- _refcast(self, x)
  Helper function to cast arrays to the internal dtype.
- copy(self, protocol=None)
  Return a deep copy of the node.
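What "deep copy" implies in practice is that the copy shares no mutable state with the original. The sketch below uses the standard library's `copy.deepcopy` on a hypothetical `TinyNode`; how the real node produces its copy, and what it does with `protocol`, is not specified here.

```python
import copy


class TinyNode:
    """Hypothetical node holding mutable state."""

    def __init__(self):
        self.centers = [[0.0, 0.0]]

    def copy(self, protocol=None):
        # `protocol` is accepted only to mirror the signature; this sketch ignores it.
        return copy.deepcopy(self)


original = TinyNode()
duplicate = original.copy()
duplicate.centers[0][0] = 1.0   # does not affect the original
```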
- inverse(self, y, *args, **kwargs)
  Invert y.
- is_training(self)
  Return True if the node is in the training phase,
  False otherwise.
- save(self, filename, protocol=-1)
  Save a pickled serialization of the node to filename.
  If filename is None, return a string.
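The two behaviors of save (write to a file, or return the serialized form when filename is None) can be sketched with the standard `pickle` module. The free function `save_sketch` is a hypothetical illustration rather than the node's method; note that under Python 3 `pickle.dumps` returns bytes, not a text string.

```python
import os
import pickle
import tempfile


def save_sketch(obj, filename, protocol=-1):
    """Pickle obj to filename, or return the pickled bytes if filename is None."""
    if filename is None:
        return pickle.dumps(obj, protocol)
    with open(filename, "wb") as fh:
        pickle.dump(obj, fh, protocol)
    return None


state = {"n_clusters": 8, "batch_size": 100}

# In-memory round trip.
blob = save_sketch(state, None)
restored_from_blob = pickle.loads(blob)

# On-disk round trip through a temporary file.
fd, path = tempfile.mkstemp(suffix=".pkl")
os.close(fd)
try:
    save_sketch(state, path)
    with open(path, "rb") as fh:
        restored_from_file = pickle.load(fh)
finally:
    os.remove(path)
```

A protocol of -1 selects the highest pickle protocol available to the running interpreter.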
- set_dtype(self, t)
  Set internal structures' dtype.