A spectral algorithm for learning mixture models

Authors

  • Santosh Vempala
  • Grant Wang
Abstract

A mixture model is a weighted combination of probability distributions. We consider the problem of identifying the component distributions of a mixture model by examining random samples from the mixture. Our main result is that a simple spectral algorithm for learning a mixture of k spherical Gaussians in n dimensions works remarkably well: it succeeds in identifying the Gaussians assuming essentially the minimum possible separation between their centers that keeps them unique. Unlike existing algorithms, the sample complexity and running time are polynomial in both n and k. We then apply it to the more general problem of learning a mixture of weakly isotropic distributions (e.g., a mixture of uniform distributions on cubes). The algorithm is robust in that it can tolerate small amounts of noise, and thus it can also be used for the more general problem of finding the best mixture model that fits a given data set, provided a good fit exists.
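The abstract does not spell out the steps of the algorithm, so the following is only a minimal sketch of one common form of the spectral approach, under stated assumptions: draw samples from a mixture of k spherical Gaussians, project them onto the span of the top k right singular vectors of the sample matrix, and cluster in that k-dimensional space. The function names, the parameter values, and the clustering step (farthest-point seeding followed by Lloyd iterations) are illustrative stand-ins, not the authors' procedure.

```python
import numpy as np


def sample_spherical_mixture(means, sigmas, weights, m, rng):
    """Draw m points: pick component i with probability weights[i],
    then sample from N(means[i], sigmas[i]^2 * I)."""
    k, n = means.shape
    labels = rng.choice(k, size=m, p=weights)
    noise = rng.standard_normal((m, n)) * sigmas[labels][:, None]
    return means[labels] + noise, labels


def spectral_project(X, k):
    """Project the rows of X onto the span of its top-k right singular vectors."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:k].T


def cluster_projected(Y, k, iters=20):
    """Stand-in clustering (an assumption, not the paper's method):
    farthest-point seeding followed by Lloyd iterations."""
    centers = [Y[0]]
    for _ in range(k - 1):
        d = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = Y[assign == j].mean(axis=0)
    return assign


def run_demo(seed=0):
    """Hypothetical toy instance: 3 well-separated unit-variance Gaussians in 50 dimensions."""
    rng = np.random.default_rng(seed)
    n, k, m = 50, 3, 600
    means = np.zeros((k, n))
    means[0, 0], means[1, 1], means[2, 2] = 10.0, 10.0, 10.0
    sigmas = np.ones(k)
    weights = np.array([0.3, 0.3, 0.4])
    X, labels = sample_spherical_mixture(means, sigmas, weights, m, rng)
    Y = spectral_project(X, k)
    assign = cluster_projected(Y, k)
    # Purity: fraction of points whose cluster's majority label matches their own.
    correct = sum(
        np.bincount(labels[assign == j], minlength=k).max()
        for j in range(k)
        if np.any(assign == j)
    )
    return Y.shape, correct / m


if __name__ == "__main__":
    shape, purity = run_demo()
    print("projected shape:", shape, "cluster purity:", purity)
```

The point of the projection is that the noise a spherical Gaussian contributes inside a k-dimensional subspace scales with k rather than n, while the distances between the centers are largely preserved, so clustering becomes easier after projecting. The toy separation used here is deliberately generous and says nothing about the separation bound the paper proves.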


Similar resources

Beyond Gaussians: Spectral Methods for Learning Mixtures of Heavy-Tailed Distributions

We study the problem of learning mixtures of distributions, a natural formalization of clustering. A mixture of distributions is a collection of distributions D = {D1, ..., DT} and weights w1, ..., wT. A sample from a mixture is drawn by selecting Di with probability wi and then selecting a sample from Di. The goal, in learning a mixture, is to learn the parameters of the distributions ...


The Spectral Method for General Mixture Models

We present an algorithm for learning a mixture of distributions based on spectral projection. We prove a general property of spectral projection for arbitrary mixtures and show that the resulting algorithm is efficient when the components of the mixture are logconcave distributions in R^n whose means are separated. The separation required grows with k, the number of components, and log n. This is ...


Image Segmentation using Gaussian Mixture Model

Abstract: Stochastic models such as mixture models, graphical models, Markov random fields and hidden Markov models play a key role in probabilistic data analysis. In this paper, we fit a Gaussian mixture model to the pixels of an image. The parameters of the model were estimated by the EM algorithm. In addition, a label for each pixel of the true image was assigned using Bayes' rule. In fact,...


Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model

Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....


Cloud masking in remotely sensed hyperspectral images using linear and nonlinear spectral mixture analysis

In this paper, we analyze the effectiveness of spectral mixture techniques in the generation of a cloud abundance mask. Two different mixture models are considered: linear and nonlinear. The linear model first identifies pure spectral constituents (endmembers) and then expresses mixed pixels as a linear combination of endmembers. It is clear that there are naturally occurring situations where non...



Journal:
  • J. Comput. Syst. Sci.

Volume 68, Issue

Pages -

Publication date: 2004