A possibilistic clustering approach toward generative mixture models

Authors

  • Sotirios Chatzis
  • Gavriil Tsechpenakis
Abstract

Generative mixture models (MMs) provide one of the most popular methodologies for unsupervised data clustering. MMs are formulated on the assumption that each observation derives from (belongs to) a single cluster. However, in many applications, data may intuitively belong to multiple classes, which renders the single-cluster assignment assumption of MMs inappropriate. Furthermore, even in applications where a single-cluster data assignment is required, the multinomial allocation of the modeled data points to the clusters derived by an MM imposes the constraint that the membership probabilities of a data point across clusters sum to one. This constraint makes MMs very vulnerable to the presence of outliers in the clustered data sets, and renders them ineffective in discriminating between cases of equal evidence and cases of ignorance. To resolve these issues, in this paper we introduce a possibilistic formulation of MMs. Possibilistic clustering is a methodology that yields possibilistic data partitions, with the obtained membership values being interpreted as degrees of possibility (compatibilities) of the data points with respect to the various clusters. We provide an efficient maximum-likelihood fitting algorithm for the proposed model, and we conduct an objective evaluation of its efficacy using benchmark data. © 2011 Elsevier Ltd. All rights reserved.
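
The sum-to-one constraint described in the abstract can be illustrated with a minimal sketch. This is not the model proposed in the paper: it assumes two isotropic, unit-variance Gaussian clusters in 2D with equal weights, and it uses an unnormalized Gaussian compatibility score as a stand-in for a possibilistic membership. The function names, cluster centers, and test points are all hypothetical and chosen only for illustration.

```python
import math

def sq_dist(x, c):
    """Squared Euclidean distance between point x and center c."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))

def posterior_memberships(x, centers, weights, var=1.0):
    """Mixture-model responsibilities; by construction they sum to one.

    All clusters share the same variance (an assumption of this sketch),
    so the Gaussian normalizing constant cancels in the ratio.
    """
    scores = [w * math.exp(-0.5 * sq_dist(x, c) / var)
              for c, w in zip(centers, weights)]
    total = sum(scores)
    return [s / total for s in scores]

def possibilistic_memberships(x, centers, var=1.0):
    """Illustrative compatibilities in (0, 1], scored per cluster.

    No normalization across clusters is applied, so the values
    need not sum to one.
    """
    return [math.exp(-0.5 * sq_dist(x, c) / var) for c in centers]

# Hypothetical setup: two clusters, one point between them, one far from both.
centers = [(-2.0, 0.0), (2.0, 0.0)]
weights = [0.5, 0.5]
midpoint = (0.0, 0.0)   # moderately compatible with both clusters
outlier = (0.0, 10.0)   # far from both clusters, but equidistant

post_mid = posterior_memberships(midpoint, centers, weights)
post_out = posterior_memberships(outlier, centers, weights)
poss_mid = possibilistic_memberships(midpoint, centers)
poss_out = possibilistic_memberships(outlier, centers)

print("posterior, midpoint:", post_mid)
print("posterior, outlier: ", post_out)
print("possibilistic, midpoint:", poss_mid)
print("possibilistic, outlier: ", poss_out)
```

Under these assumptions the normalized memberships are identical for the two points, so the outlier is indistinguishable from a point that both clusters support equally. The unnormalized scores differ by many orders of magnitude, which is the kind of distinction between equal evidence and ignorance that the abstract refers to.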


Similar resources

Image Segmentation: Type–2 Fuzzy Possibilistic C-Mean Clustering Approach

Image segmentation is an essential issue in image description and classification. Currently, in many real applications, segmentation is still mainly manual or strongly supervised by a human expert, which makes it irreproducible and deteriorating. Moreover, there are many uncertainties and vagueness in images, which crisp clustering and even Type-1 fuzzy clustering could not handle. Hence, Type-...


Variational Deep Embedding: An Unsupervised and Generative Approach to Clustering

Clustering is among the most fundamental tasks in computer vision and machine learning. In this paper, we propose Variational Deep Embedding (VaDE), a novel unsupervised generative clustering approach within the framework of Variational Auto-Encoder (VAE). Specifically, VaDE models the data generative procedure with a Gaussian Mixture Model (GMM) and a deep neural network (DNN): 1) the GMM pick...


Non-parametric Mixture Models for Clustering

Mixture models have been widely used for data clustering. However, commonly used mixture models are generally of a parametric form (e.g., mixture of Gaussian distributions or GMM), which significantly limits their capacity in fitting diverse multidimensional data distributions encountered in practice. We propose a non-parametric mixture model (NMM) for data clustering in order to detect cluster...


Multiplicative Mixture Models Approximate Maximum Likelihood Parameter Estimation For The Multiplicative Mixture Model And Overlapping Clustering

Most parametric clustering algorithms in use today employ generative models that do not have a natural mechanism to give rise to overlapping clusters. The multiplicative mixture model has been recently proposed as a generative model that can naturally give rise to overlapping clusters. However, performing maximum likelihood parameter estimation for this model using a standard technique like exp...


A Comparative Study of Generative Models for Document Clustering

Generative models based on the multivariate Bernoulli and multinomial distributions have been widely used for text classification. Recently, the spherical k-means algorithm, which has desirable properties for text clustering, has been shown to be a special case of a generative model based on a mixture of von Mises-Fisher (vMF) distributions. This paper compares these three probabilistic models ...



Journal:
  • Pattern Recognition

Volume 45  Issue 

Pages  -

Publication date: 2012