Package mdp :: Package utils

Package utils

Classes

CovarianceMatrix
This class stores an empirical covariance matrix that can be updated incrementally. A call to the 'fix' method returns the current state of the covariance matrix, the average and the number of observations, and resets the internal data.

CrossCovarianceMatrix

DelayCovarianceMatrix
This class stores an empirical covariance matrix between the signal and time delayed signal that can be updated incrementally.

HTMLSlideShow
Abstract slideshow base class.

ImageHTMLSlideShow
Slideshow for images.

MultipleCovarianceMatrices
Container class for multiple covariance matrices to easily execute operations on all matrices at the same time. Note: all operations are done in place where possible.

OrderedDict
Dictionary that remembers insertion order

QuadraticForm
Define an inhomogeneous quadratic form as 1/2 x'Hx + f'x + c . This class implements the quadratic form analysis methods presented in:

QuadraticFormException

SectionHTMLSlideShow
Astract slideshow with additional support for section markers.

SectionImageHTMLSlideShow
Image slideshow with section markers.

TemporaryDirectory
Create and return a temporary directory. This has the same behavior as mkdtemp but can be used as a context manager. For example:

Functions

[hide private]

_fixup_namespace_item(parent, mname, name, old_modules, path)

_without_prefix(name, prefix)

basic_css()
Return the basic default CSS.

bool_to_sign(an_array)
Return -1 for each False; +1 for each True

comb(N, k)
Return number of combinations of k objects from a set of N objects without repetitions, a.k.a. the binomial coefficient of N and k.

cov2(x, y)
Compute the covariance between 2D matrices x and y. Complies with the old scipy.cov function: different variables are on different columns.

dig_node(x)
Crawl recursively an MDP Node looking for arrays.

fixup_namespace(mname, names, old_modules, keep_modules=())
Update __module__ attribute and remove old_modules from namespace

gabor(size, alpha, phi, freq, sgm, x0=None, res=1, ampl=1.0)
Return a 2D array containing a Gabor wavelet.

get_dtypes(typecodes_key, _safe=True)
Return the list of dtypes corresponding to the set of typecodes defined in numpy.typecodes[typecodes_key]. E.g., get_dtypes('Float') = [dtype('f'), dtype('d'), dtype('g')].

get_node_size(x)
Return node total byte-size using cPickle with protocol=2.

hermitian(x)
Compute the Hermitian, i.e. conjugate transpose, of x.

image_slideshow(filenames, image_size, title=None, section_ids=None, delay=100, delay_delta=20, loop=True, slideshow_id=None, magnification=1, mag_control=True, shortcuts=True)
Return a string with the JS and HTML code for an image slideshow.

image_slideshow_css()
Use nearest neighbour resampling in Firefox 3.6+ and IE.

inv(x)

irep(x, n, dim)
Replicate x n-times on a new dimension dim-th dimension

izip_stretched(*iterables)
Same as izip, except that for convenience non-iterables are repeated ad infinitum.

lrep(x, n)
Replicate x n-times on a new first dimension

matmult(a, b, alpha=1.0, beta=0.0, c=None, trans_a=0, trans_b=0)
Return alpha*(a*b) + beta*c.

mult(a, b, out=None)
Dot product of two arrays.

mult_diag(d, mtx, left=True)
Multiply a full matrix by a diagonal matrix. This function should always be faster than dot.

nongeneral_svd(A, range=None, **kwargs)
SVD routine for simple eigenvalue problem, API is compatible with symeig.

norm2(v)
Compute the 2-norm for 1D arrays. norm2(v) = sqrt(sum(v_i^2))

orthogonal_permutations(a_dict)
Takes a dictionary with lists as keys and returns all permutations of these list elements in new dicts.

permute(x, indices=(0, 0), rows=0, cols=1)
Swap two columns and (or) two rows of 'x', whose indices are specified in indices=[i,j]. Note: permutations are done in-place. You'll lose your original matrix

pinv(x)

progressinfo(sequence, length=None, style='bar', custom=None)
A fully configurable text-mode progress info box tailored to the command-line die-hards.

random_rot(dim, dtype='d')
Return a random rotation matrix, drawn from the Haar distribution (the only uniform distribution on SO(n)). The algorithm is described in the paper Stewart, G.W., "The efficient generation of random orthogonal matrices with an application to condition estimators", SIAM Journal on Numerical Analysis, 17(3), pp. 403-409, 1980. For more information see http://en.wikipedia.org/wiki/Orthogonal_matrix#Randomization

refcast(array, dtype)
Cast the array to dtype only if necessary, otherwise return a reference.

rotate(mat, angle, columns=(0, 1), units='radians')
Rotate in-place data matrix (NxM) in the plane defined by the columns=[i,j] when observation are stored on rows. Observations are rotated counterclockwise. This corresponds to the following matrix-multiplication for each data-point (unchanged elements omitted):

rrep(x, n)
Replicate x n-times on a new last dimension

scast(scalar, dtype)
Convert a scalar in a 0D array of the given dtype.

sign_to_bool(an_array, zero=True)
Return False for each negative value, else True.

slideshow_css()
Return the additional CSS for a slideshow.

solve(x, y)

sqrtm(A)
This is a symmetric definite positive matrix sqrt function

svd(x, compute_uv=True)
Wrap the numx SVD routine, so that it returns arrays of the correct dtype and a SymeigException in case of failures.

symrand(dim_or_eigv, dtype='d')
Return a random symmetric (Hermitian) matrix.

timediff(data)
Returns the array of the time differences of data.

weighted_choice(a_dict, normalize=True)
Returns a key from a dictionary based on the weight that the value suggests. If 'normalize' is False, it is assumed the weights sum up to unity. Otherwise, the algorithm will take care of normalising.

Variables

[hide private]

FIXUP_DEBUG = None
hash(x)

__package__ = 'mdp.utils'

Function Details

[hide private]

_fixup_namespace_item(parent, mname, name, old_modules, path)

_without_prefix(name, prefix)

basic_css()

Return the basic default CSS.

bool_to_sign(an_array)

Return -1 for each False; +1 for each True

comb(N, k)

Return number of combinations of k objects from a set of N objects without repetitions, a.k.a. the binomial coefficient of N and k.

cov2(x, y)

Compute the covariance between 2D matrices x and y. Complies with the old scipy.cov function: different variables are on different columns.

dig_node(x)

Crawl recursively an MDP Node looking for arrays.

Return (dictionary, string), where the dictionary is: { attribute_name: (size_in_bytes, array_reference)} and string is a nice string representation of it.

fixup_namespace(mname, names, old_modules, keep_modules=`()`)

Update __module__ attribute and remove old_modules from namespace

When classes are imported from implementation modules into the package exporting them, the __module__ attribute reflects the place of definition. Splitting the code into separate files (and thus modules) makes the implementation managable. Nevertheless, we do not want the implementation modules to be visible and delete their names from the package's namespace. This causes some problems: when looking at the exported classes and other objects, their __module__ attribute points to something non-importable, repr output and documentation do not show the module from which they are supposed to be imported. The documentation generators like epydoc and sphinx are also confused. To alleviate those problems, the __module__ attributes of all exported classes defined in a "private" module and then exported elsewhere are changed to the latter.

For each name in names, if <mname>.<name> is accessible, and if its __module__ attribute is equal to one of the names in old_modules, it is changed to "<mname>". In other words, all the __module__ attributes of objects exported from module <mname> are updated, iff they used to point to one of the "private" modules in old_modules.

This operation is performed not only for classes, but actually for all objects with the __module__ attribute, following the rules stated above. The operation is also performed recursively, not only for names in names, but also for methods, inner classes, and other attributes. This recursive invocation is necessary because all the problems affecting top-level exported classes also affect their attributes visible for the user, and especially documented functions.

If names is None, all public names in module <mname> (not starting with '_') are affected.

After the __module__ attributes are changed, "private" modules given in old_modules, except for the ones in keep_modules, are deleted from the namespace of <mname> module.

gabor(size, alpha, phi, freq, sgm, x0=None, res=1, ampl=1.0)

Return a 2D array containing a Gabor wavelet.

Input arguments:
size -- (height, width) (pixels)
alpha -- orientation (rad)
phi -- phase (rad)
freq -- frequency (cycles/deg)
sgm -- (sigma_x, sigma_y) standard deviation along the axis
       of the gaussian ellipse (pixel)
x0 -- (x,y) coordinates of the center of the wavelet (pixel)
      Default: None, meaning the center of the array
res -- spatial resolution (deg/pixel)
       Default: 1, so that 'freq' is measured in cycles/pixel
ampl -- constant multiplying the result
        Default: 1.

get_dtypes(typecodes_key, _safe=True)

Return the list of dtypes corresponding to the set of typecodes defined in numpy.typecodes[typecodes_key]. E.g., get_dtypes('Float') = [dtype('f'), dtype('d'), dtype('g')].

If _safe is True (default), we remove large floating point types if the numerical backend does not support them.

get_node_size(x)

Return node total byte-size using cPickle with protocol=2.

The byte-size is related to the memory needed by the node).

hermitian(x)

Compute the Hermitian, i.e. conjugate transpose, of x.

image_slideshow(filenames, image_size, title=None, section_ids=None, delay=100, delay_delta=20, loop=True, slideshow_id=None, magnification=1, mag_control=True, shortcuts=True)

Return a string with the JS and HTML code for an image slideshow.

Note that the CSS code for the slideshow is not included, so you should
add SLIDESHOW_STYLE or a custom style to your CSS code.

filenames -- Sequence of the image filenames.
image_size -- Tuple (x,y) with the original image size, or enter
    a different size to force scaling.
title -- Optional slideshow title (for default None not title is shown).
section_ids -- List with the section id for each slide index. The id
        can be a string or a number. Default value None disables the
        section feature.

For additional keyword arguments see the ImageHTMLSlideShow class.

image_slideshow_css()

Use nearest neighbour resampling in Firefox 3.6+ and IE.

Webkit (Chrome, Safari) does not support this yet. (see http://code.google.com/p/chromium/issues/detail?id=1502)

inv(x)

irep(x, n, dim)

Replicate x n-times on a new dimension dim-th dimension

izip_stretched(*iterables)

Same as izip, except that for convenience non-iterables are repeated ad infinitum.

This is useful when trying to zip input data with respective labels
and allows for having a single label for all data, as well as for
havning a list of labels for each data vector.
Note that this will take strings as an iterable (of course), so
strings acting as a single value need to be wrapped in a repeat
statement of their own.

Thus,
>>> for zipped in izip_stretched([1, 2, 3], -1):
        print zipped
(1, -1)
(2, -1)
(3, -1)

is equivalent to
>>> for zipped in izip([1, 2, 3], [-1] * 3):
        print zipped
(1, -1)
(2, -1)
(3, -1)

lrep(x, n)

Replicate x n-times on a new first dimension

matmult(a, b, alpha=1.0, beta=0.0, c=None, trans_a=0, trans_b=0)

Return alpha*(a*b) + beta*c.
a,b,c : matrices
alpha, beta: scalars
trans_a : 0 (a not transposed), 1 (a transposed),
          2 (a conjugate transposed)
trans_b : 0 (b not transposed), 1 (b transposed),
          2 (b conjugate transposed)

mult(a, b, out=None)

Dot product of two arrays.

For 2-D arrays it is equivalent to matrix multiplication, and for 1-D
arrays to inner product of vectors (without complex conjugation). For
N dimensions it is a sum product over the last axis of `a` and
the second-to-last of `b`::

    dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])

Parameters
----------
a : array_like
    First argument.
b : array_like
    Second argument.
out : ndarray, optional
    Output argument. This must have the exact kind that would be returned
    if it was not used. In particular, it must have the right type, must be
    C-contiguous, and its dtype must be the dtype that would be returned
    for `dot(a,b)`. This is a performance feature. Therefore, if these
    conditions are not met, an exception is raised, instead of attempting
    to be flexible.

Returns
-------
output : ndarray
    Returns the dot product of `a` and `b`.  If `a` and `b` are both
    scalars or both 1-D arrays then a scalar is returned; otherwise
    an array is returned.
    If `out` is given, then it is returned.

Raises
------
ValueError
    If the last dimension of `a` is not the same size as
    the second-to-last dimension of `b`.

See Also
--------
vdot : Complex-conjugating dot product.
tensordot : Sum products over arbitrary axes.
einsum : Einstein summation convention.
matmul : '@' operator as method with out parameter.

Examples
--------
>>> np.dot(3, 4)
12

Neither argument is complex-conjugated:

>>> np.dot([2j, 3j], [2j, 3j])
(-13+0j)

For 2-D arrays it is the matrix product:

>>> a = [[1, 0], [0, 1]]
>>> b = [[4, 1], [2, 2]]
>>> np.dot(a, b)
array([[4, 1],
       [2, 2]])

>>> a = np.arange(3*4*5*6).reshape((3,4,5,6))
>>> b = np.arange(3*4*5*6)[::-1].reshape((5,4,6,3))
>>> np.dot(a, b)[2,3,2,1,2,2]
499128
>>> sum(a[2,3,2,:] * b[1,2,:,2])
499128

mult_diag(d, mtx, left=True)

Multiply a full matrix by a diagonal matrix. This function should always be faster than dot.

Input:: d -- 1D (N,) array (contains the diagonal elements) mtx -- 2D (N,N) array
Output:: mult_diag(d, mts, left=True) == dot(diag(d), mtx) mult_diag(d, mts, left=False) == dot(mtx, diag(d))

nongeneral_svd(A, range=None, **kwargs)

SVD routine for simple eigenvalue problem, API is compatible with symeig.

norm2(v)

Compute the 2-norm for 1D arrays. norm2(v) = sqrt(sum(v_i^2))

orthogonal_permutations(a_dict)

Takes a dictionary with lists as keys and returns all permutations of these list elements in new dicts.

This function is useful, when a method with several arguments shall be tested and all of the arguments can take several values.

The order is not defined, therefore the elements should be orthogonal to each other.

>>> for i in orthogonal_permutations({'a': [1,2,3], 'b': [4,5]}):
        print i
{'a': 1, 'b': 4}
{'a': 1, 'b': 5}
{'a': 2, 'b': 4}
{'a': 2, 'b': 5}
{'a': 3, 'b': 4}
{'a': 3, 'b': 5}

permute(x, indices=`(`0`,` 0`)`, rows=0, cols=1)

Swap two columns and (or) two rows of 'x', whose indices are specified in indices=[i,j]. Note: permutations are done in-place. You'll lose your original matrix

pinv(x)

progressinfo(sequence, length=None, style=`'bar'`, custom=None)

A fully configurable text-mode progress info box tailored to the
  command-line die-hards.

  To get a progress info box for your loops use it like this:

     >>> for i in progressinfo(sequence):
     ...     do_something(i)

 You can also use it with generators, files or any other iterable object,
 but in this case you have to specify the total length of the sequence:

     >>> for line in progressinfo(open_file, nlines):
     ...     do_something(line)

 If the number of iterations is not known in advance, you may prefer
 to iterate on the items directly. This can be useful for example if
 you are downloading a big file in a subprocess and want to monitor
 the progress. If the file to be downloaded is TOTAL bytes large and
 you are downloading it on local:
     >>> def done():
     ...     yield os.path.getsize(localfile)
     >>> for bytes in progressinfo(done(), -TOTAL)
     ...     time.sleep(1)
     ...     if download_process_has_finished():
     ...         break


Arguments:

sequence    - if it is a Python container object (list,
              dict, string, etc...) and it supports the
              __len__ method call, the length argument can
              be omitted. If it is an iterator (generators,
              file objects, etc...) the length argument must
              be specified.

Keyword arguments:

length     - length of the sequence. Automatically set
             if `sequence' has the __len__ method. If length is
             negative, iterate on items.

style      - If style == 'bar', display a progress bar. The
             default layout is:

             [===========60%===>.........]

             If style == 'timer', display a time elapsed / time
             remaining info box. The default layout is:

             23% [02:01:28] - [00:12:37]

             where fields have the following meaning:

             percent_done% [time_elapsed] - [time_remaining]

custom     - a dictionary for customizing the layout.
             Default layout for the 'bar' style:
              custom = { 'indent': '',
                         'width' : terminal_width - 1,
                         'position' : 'middle',
                         'delimiters' : '[]',
                         'char1' : '=',
                         'char2' : '>',
                         'char3' : '.' }


             Default layout for the 'timer' style:
              custom = { 'speed': 'mean',
                         'indent': '',
                         'position' : 'left',
                         'delimiters' : '[]',
                         'separator' : ' - ' }

             Description:
               speed = completion time estimation method, must be one of
                       ['mean', 'last']. 'mean' uses average speed, 'last'
                       uses last step speed.
               indent = string used for indenting the progress info box
               position = position of the percent done string,
                          must be one out of ['left', 'middle', 'right']


Note 1: by default sys.stdout is flushed each time a new box is drawn.
        If you need to rely on buffered stdout you'd better not use this
        (any?) progress info box.
Note 2: progressinfo slows down your loops. Always profile your scripts
        and check that you are not wasting 99% of the time in drawing
        the progress info box.

random_rot(dim, dtype=`'d'`)

Return a random rotation matrix, drawn from the Haar distribution (the only uniform distribution on SO(n)). The algorithm is described in the paper Stewart, G.W., "The efficient generation of random orthogonal matrices with an application to condition estimators", SIAM Journal on Numerical Analysis, 17(3), pp. 403-409, 1980. For more information see http://en.wikipedia.org/wiki/Orthogonal_matrix#Randomization

refcast(array, dtype)

Cast the array to dtype only if necessary, otherwise return a reference.

rotate(mat, angle, columns=`(`0`,` 1`)`, units=`'radians'`)

Rotate in-place data matrix (NxM) in the plane defined by the columns=[i,j] when observation are stored on rows. Observations are rotated counterclockwise. This corresponds to the following matrix-multiplication for each data-point (unchanged elements omitted):

[ cos(angle) -sin(angle) [ x_i ]

sin(angle) cos(angle) ] * [ x_j ]

If M=2, columns=[0,1].

rrep(x, n)

Replicate x n-times on a new last dimension

scast(scalar, dtype)

Convert a scalar in a 0D array of the given dtype.

sign_to_bool(an_array, zero=True)

Return False for each negative value, else True.

The value for 0 is specified with 'zero'.

slideshow_css()

Return the additional CSS for a slideshow.

solve(x, y)

sqrtm(A)

This is a symmetric definite positive matrix sqrt function

svd(x, compute_uv=True)

Wrap the numx SVD routine, so that it returns arrays of the correct dtype and a SymeigException in case of failures.

symrand(dim_or_eigv, dtype=`'d'`)

Return a random symmetric (Hermitian) matrix.

If 'dim_or_eigv' is an integer N, return a NxN matrix, with eigenvalues: uniformly distributed on (-1,1).
If 'dim_or_eigv' is 1-D real array 'a', return a matrix whose: eigenvalues are 'a'.

timediff(data)

Returns the array of the time differences of data.

weighted_choice(a_dict, normalize=True)

Returns a key from a dictionary based on the weight that the value suggests. If 'normalize' is False, it is assumed the weights sum up to unity. Otherwise, the algorithm will take care of normalising.

Example: >>> d = {'a': 0.1, 'b': 0.5, 'c': 0.4} >>> weighted_choice(d) # draws 'b':'c':'a' with 5:4:1 probability

TODO: It might be good to either shuffle the order or explicitely specify it, before walking through the items, to minimise possible degeneration.

Variables Details

[hide private]

FIXUP_DEBUG

hash(x)

Value:

None

package

Value:

'mdp.utils'

Package utils

_fixup_namespace_item(parent, mname, name, old_modules, path)

_without_prefix(name, prefix)

basic_css()

bool_to_sign(an_array)

comb(N, k)

cov2(x, y)

dig_node(x)

fixup_namespace(mname, names, old_modules, keep_modules=())

gabor(size, alpha, phi, freq, sgm, x0=None, res=1, ampl=1.0)

get_dtypes(typecodes_key, _safe=True)

get_node_size(x)

hermitian(x)

image_slideshow(filenames, image_size, title=None, section_ids=None, delay=100, delay_delta=20, loop=True, slideshow_id=None, magnification=1, mag_control=True, shortcuts=True)

image_slideshow_css()

inv(x)

irep(x, n, dim)

izip_stretched(*iterables)

lrep(x, n)

matmult(a, b, alpha=1.0, beta=0.0, c=None, trans_a=0, trans_b=0)

mult(a, b, out=None)

mult_diag(d, mtx, left=True)

nongeneral_svd(A, range=None, **kwargs)

norm2(v)

orthogonal_permutations(a_dict)

permute(x, indices=(0, 0), rows=0, cols=1)

pinv(x)

progressinfo(sequence, length=None, style='bar', custom=None)

random_rot(dim, dtype='d')

refcast(array, dtype)

rotate(mat, angle, columns=(0, 1), units='radians')

rrep(x, n)

scast(scalar, dtype)

sign_to_bool(an_array, zero=True)

slideshow_css()

solve(x, y)

sqrtm(A)

svd(x, compute_uv=True)

symrand(dim_or_eigv, dtype='d')

timediff(data)

weighted_choice(a_dict, normalize=True)

FIXUP_DEBUG

__package__

fixup_namespace(mname, names, old_modules, keep_modules=`()`)

permute(x, indices=`(`0`,` 0`)`, rows=0, cols=1)

progressinfo(sequence, length=None, style=`'bar'`, custom=None)

random_rot(dim, dtype=`'d'`)

rotate(mat, angle, columns=`(`0`,` 1`)`, units=`'radians'`)

symrand(dim_or_eigv, dtype=`'d'`)

package