pyMEF: a Python library for mixtures of exponential families

Description

pyMEF is a Python framework for manipulating, learning, simplifying and comparing mixtures of exponential families. It is designed to ease the use of various exponential families in mixture models.

See also jMEF, a Java implementation of a similar library, and libmef, a faster C implementation.

What are exponential families?

An exponential family is a generic set of probability distributions that admit the following canonical decomposition:

p_F(x; \theta) = \exp \left( \langle t(x) | \theta \rangle - F(\theta) + k(x) \right)

Here t(x) is the sufficient statistic, \theta the natural parameter, F the log-normalizer and k(x) the carrier measure. Exponential families are characterized by the log-normalizer function F, and include the following well-known distributions: Gaussian (generic, isotropic Gaussian, diagonal Gaussian, rectified Gaussian or Wald distributions, lognormal), Poisson, Bernoulli, binomial, multinomial, Laplacian, Gamma (incl. chi-squared), Beta, exponential, Wishart, Dirichlet, Rayleigh, probability simplex, negative binomial distribution, Weibull, von Mises, Pareto distributions, skew logistic, etc.
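As an illustration of the canonical form, the sketch below writes the Poisson distribution with t(x) = x, \theta = log(lambda), F(\theta) = exp(\theta) and k(x) = -log(x!), and checks it against the usual probability mass function. It uses only the Python standard library; the function names are hypothetical and are not part of the pyMEF API.

```python
import math

def poisson_sufficient_statistic(x):
    # t(x) = x
    return x

def poisson_log_normalizer(theta):
    # F(theta) = exp(theta), with natural parameter theta = log(lambda)
    return math.exp(theta)

def poisson_carrier(x):
    # k(x) = -log(x!), computed through the log-gamma function
    return -math.lgamma(x + 1)

def canonical_density(x, theta):
    # p_F(x; theta) = exp(<t(x) | theta> - F(theta) + k(x))
    inner = poisson_sufficient_statistic(x) * theta
    return math.exp(inner - poisson_log_normalizer(theta) + poisson_carrier(x))

def poisson_pmf(x, lam):
    # Textbook form, used only as a reference for the comparison
    return lam ** x * math.exp(-lam) / math.factorial(x)

lam = 3.5
theta = math.log(lam)
for x in range(6):
    # Both expressions should agree up to floating-point error
    assert math.isclose(canonical_density(x, theta), poisson_pmf(x, lam), rel_tol=1e-9)
```

Swapping in another triple (t, F, k) gives another family while canonical_density keeps the same shape, which is the property a generic mixture framework can build on.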

Mixtures of exponential families provide a generic framework for handling Gaussian mixture models (GMMs, also called MoGs for mixtures of Gaussians), mixtures of Poisson distributions, and Laplacian mixture models.
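A minimal sketch of how a mixture density can be evaluated on top of the canonical form, here for a univariate Gaussian with t(x) = (x, x^2), \theta = (mu / sigma^2, -1 / (2 sigma^2)) and k(x) = 0. All names are illustrative assumptions, not the pyMEF interface.

```python
import math

def gaussian_natural_parameters(mu, sigma2):
    # theta = (mu / sigma^2, -1 / (2 sigma^2))
    return (mu / sigma2, -1.0 / (2.0 * sigma2))

def gaussian_log_normalizer(theta):
    # F(theta) = -theta1^2 / (4 theta2) + 0.5 * log(-pi / theta2)
    t1, t2 = theta
    return -t1 * t1 / (4.0 * t2) + 0.5 * math.log(-math.pi / t2)

def gaussian_density(x, theta):
    # <t(x) | theta> with t(x) = (x, x^2); the carrier measure k(x) is 0
    t1, t2 = theta
    inner = x * t1 + x * x * t2
    return math.exp(inner - gaussian_log_normalizer(theta))

def mixture_density(x, weights, thetas, component_density):
    # Weighted sum of component densities; weights must sum to 1
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("mixture weights must sum to 1")
    return sum(w * component_density(x, th) for w, th in zip(weights, thetas))

weights = [0.4, 0.6]
thetas = [gaussian_natural_parameters(-1.0, 0.5),
          gaussian_natural_parameters(2.0, 1.5)]
value = mixture_density(0.5, weights, thetas, gaussian_density)
```

Because mixture_density only needs a component density and a list of natural parameters, the same function would serve a Poisson or Laplacian mixture once the corresponding density is supplied.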

Tutorials

A generic tutorial on exponential families and the simplification of mixture models was given at the Matrix Information Geometries workshop.

More pyMEF-specific tutorials are available here:

Module references

pyMEF.MixtureModel(size, efclass, efparam)
pyMEF.Build.KDE(data, efclass, efparam[, ...]) Kernel density estimation (only for Gaussian kernels)
pyMEF.Build.BregmanSoftClustering(data, k, ...)
pyMEF.Simplify.BregmanHardClustering(mixture, k)
pyMEF.Compare.EMD
pyMEF.Compare.KullbackLeibler([count])
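
The clustering entries above are named after Bregman divergences. As background, for two members of the same exponential family the Kullback-Leibler divergence reduces to a Bregman divergence on the natural parameters with swapped arguments: KL(p_theta1 || p_theta2) = B_F(theta2, theta1). The sketch below checks this for the Poisson family (scalar natural parameter). It is an illustration under these assumptions, with hypothetical function names, and does not describe how the pyMEF modules are implemented. No such closed form is available between two mixtures.

```python
import math

def bregman_divergence(F, grad_F, theta_p, theta_q):
    # B_F(p, q) = F(p) - F(q) - (p - q) * grad_F(q), scalar parameter case
    return F(theta_p) - F(theta_q) - (theta_p - theta_q) * grad_F(theta_q)

def kl_same_family(F, grad_F, theta_1, theta_2):
    # KL(p_theta1 || p_theta2) = B_F(theta_2, theta_1): note the swapped order
    return bregman_divergence(F, grad_F, theta_2, theta_1)

# Poisson family: F(theta) = exp(theta), so its gradient is also exp(theta)
poisson_F = math.exp
poisson_grad_F = math.exp

lam1, lam2 = 2.0, 5.0
kl = kl_same_family(poisson_F, poisson_grad_F, math.log(lam1), math.log(lam2))

# Textbook closed form for two Poisson distributions, used as a reference
closed_form = lam1 * math.log(lam1 / lam2) - lam1 + lam2
assert math.isclose(kl, closed_form, rel_tol=1e-9)
```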

Download

Currently, there is no official release of pyMEF, but you can have a look at the public darcs repository.

Bibliography

  • Olivier Schwander, Frank Nielsen, Simplification de modèles de mélange issus d’estimateur par noyau, GRETSI 2011
  • Olivier Schwander and Frank Nielsen, pyMEF - A framework for Exponential Families in Python, in Proceedings of the 2011 IEEE Workshop on Statistical Signal Processing
  • Vincent Garcia, Frank Nielsen, and Richard Nock, Levels of details for Gaussian mixture models, in Proceedings of the Asian Conference on Computer Vision, Xi’an, China, September 2009
  • Frank Nielsen and Vincent Garcia, Statistical exponential families: A digest with flash cards, arXiv, http://arxiv.org/abs/0911.4863, November 2009
  • Frank Nielsen and Richard Nock, Sided and symmetrized Bregman centroids, in IEEE Transactions on Information Theory, 2009, 55, 2882-2904
  • Frank Nielsen, Jean-Daniel Boissonnat and Richard Nock, On Bregman Voronoi diagrams, in ACM-SIAM Symposium on Discrete Algorithms, 2007, 746-755
  • A. Banerjee, S. Merugu, I. Dhillon, and J. Ghosh, Clustering with Bregman divergences, in Journal of Machine Learning Research, 2005, 6, 1705-1749

Contacts

Please send any comments or bug reports to Olivier Schwander or Frank Nielsen.
