| Home | Trees | Indices | Help |
|
|---|
|
|
Transforms lists of feature-value mappings to vectors.
This node has been automatically generated by wrapping the ``sklearn.feature_extraction.dict_vectorizer.DictVectorizer`` class
from the ``sklearn`` library. The wrapped instance can be accessed
through the ``scikits_alg`` attribute.
This transformer turns lists of mappings (dict-like objects) of feature
names to feature values into Numpy arrays or scipy.sparse matrices for use
with scikit-learn estimators.
When feature values are strings, this transformer will do a binary one-hot
(aka one-of-K) coding: one boolean-valued feature is constructed for each
of the possible string values that the feature can take on. For instance,
a feature "f" that can take on the values "ham" and "spam" will become two
features in the output, one signifying "f=ham", the other "f=spam".
Features that do not occur in a sample (mapping) will have a zero value
in the resulting array/matrix.
Read more in the :ref:`User Guide <dict_feature_extraction>`.
**Parameters**
dtype : callable, optional
The type of feature values. Passed to Numpy array/scipy.sparse matrix
constructors as the dtype argument.
separator: string, optional
Separator string used when constructing new features for one-hot
coding.
sparse: boolean, optional.
Whether transform should produce scipy.sparse matrices.
True by default.
sort: boolean, optional.
Whether ``feature_names_`` and ``vocabulary_`` should be sorted when fitting.
True by default.
**Attributes**
``vocabulary_`` : dict
A dictionary mapping feature names to feature indices.
``feature_names_`` : list
A list of length n_features containing the feature names (e.g., "f=ham"
and "f=spam").
**Examples**
>>> from sklearn.feature_extraction import DictVectorizer
>>> v = DictVectorizer(sparse=False)
>>> D = [{'foo': 1, 'bar': 2}, {'foo': 3, 'baz': 1}]
>>> X = v.fit_transform(D)
>>> X
array([[ 2., 0., 1.],
[ 0., 1., 3.]])
>>> v.inverse_transform(X) == [{'bar': 2.0, 'foo': 1.0}, {'baz': 1.0, 'foo': 3.0}]
True
>>> v.transform({'foo': 4, 'unseen_feature': 3})
array([[ 0., 0., 4.]])
See also
FeatureHasher : performs vectorization using only a hash function.
sklearn.preprocessing.OneHotEncoder : handles nominal/categorical features
encoded as columns of integers.
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
Inherited from Inherited from |
|||
| Inherited from Cumulator | |||
|---|---|---|---|
|
|||
|
|||
| Inherited from Node | |||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
Inherited from |
|||
| Inherited from Node | |||
|---|---|---|---|
|
_train_seq List of tuples: |
|||
|
dtype dtype |
|||
|
input_dim Input dimensions |
|||
|
output_dim Output dimensions |
|||
|
supported_dtypes Supported dtypes |
|||
|
|||
Transforms lists of feature-value mappings to vectors.
This node has been automatically generated by wrapping the ``sklearn.feature_extraction.dict_vectorizer.DictVectorizer`` class
from the ``sklearn`` library. The wrapped instance can be accessed
through the ``scikits_alg`` attribute.
This transformer turns lists of mappings (dict-like objects) of feature
names to feature values into Numpy arrays or scipy.sparse matrices for use
with scikit-learn estimators.
When feature values are strings, this transformer will do a binary one-hot
(aka one-of-K) coding: one boolean-valued feature is constructed for each
of the possible string values that the feature can take on. For instance,
a feature "f" that can take on the values "ham" and "spam" will become two
features in the output, one signifying "f=ham", the other "f=spam".
Features that do not occur in a sample (mapping) will have a zero value
in the resulting array/matrix.
Read more in the :ref:`User Guide <dict_feature_extraction>`.
**Parameters**
dtype : callable, optional
The type of feature values. Passed to Numpy array/scipy.sparse matrix
constructors as the dtype argument.
separator: string, optional
Separator string used when constructing new features for one-hot
coding.
sparse: boolean, optional.
Whether transform should produce scipy.sparse matrices.
True by default.
sort: boolean, optional.
Whether ``feature_names_`` and ``vocabulary_`` should be sorted when fitting.
True by default.
**Attributes**
``vocabulary_`` : dict
A dictionary mapping feature names to feature indices.
``feature_names_`` : list
A list of length n_features containing the feature names (e.g., "f=ham"
and "f=spam").
**Examples**
>>> from sklearn.feature_extraction import DictVectorizer
>>> v = DictVectorizer(sparse=False)
>>> D = [{'foo': 1, 'bar': 2}, {'foo': 3, 'baz': 1}]
>>> X = v.fit_transform(D)
>>> X
array([[ 2., 0., 1.],
[ 0., 1., 3.]])
>>> v.inverse_transform(X) == [{'bar': 2.0, 'foo': 1.0}, {'baz': 1.0, 'foo': 3.0}]
True
>>> v.transform({'foo': 4, 'unseen_feature': 3})
array([[ 0., 0., 4.]])
See also
FeatureHasher : performs vectorization using only a hash function.
sklearn.preprocessing.OneHotEncoder : handles nominal/categorical features
encoded as columns of integers.
|
|
|
|
Transform feature->value dicts to array or sparse matrix. This node has been automatically generated by wrapping the sklearn.feature_extraction.dict_vectorizer.DictVectorizer class from the sklearn library. The wrapped instance can be accessed through the scikits_alg attribute. Named features not encountered during fit or fit_transform will be silently ignored. Parameters
y : (ignored) Returns
|
|
|
Learn a list of feature name -> indices mappings. This node has been automatically generated by wrapping the sklearn.feature_extraction.dict_vectorizer.DictVectorizer class from the sklearn library. The wrapped instance can be accessed through the scikits_alg attribute. Parameters
y : (ignored) Returns self
|
| Home | Trees | Indices | Help |
|
|---|
| Generated by Epydoc 3.0.1 on Tue Mar 8 12:39:48 2016 | http://epydoc.sourceforge.net |