Dirichlet Process Mixtures of Generalized Mallows Models

Authors

  • Marina Meila
  • Harr Chen
Abstract

We present a Dirichlet process mixture model over discrete incomplete rankings and study two Gibbs sampling inference techniques for estimating posterior clusterings. The first approach uses a slice sampling subcomponent for estimating cluster parameters. The second approach marginalizes out several cluster parameters by taking advantage of approximations to the conditional posteriors. We empirically demonstrate (1) the effectiveness of this approximation for improving convergence, (2) the benefits of the Dirichlet process model over alternative clustering techniques for ranked data, and (3) the applicability of the approach to exploring large real-world ranking datasets.
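
As a rough illustration of the mixture components named in the title, the sketch below evaluates the log-probability of a complete ranking under a generalized Mallows model. It assumes the standard inversion-code parameterization with one dispersion parameter per position; the abstract does not state the exact form used, and the paper itself treats incomplete rankings, which this sketch does not handle. The names inversion_code, gmm_log_prob, sigma, and theta are hypothetical.

```python
import itertools
import math


def inversion_code(pi, sigma):
    """Inversion code of ranking pi relative to central ordering sigma.

    Entry j counts the items that follow sigma[j] in sigma but precede it
    in pi. The entries sum to the Kendall tau distance between pi and sigma.
    """
    pos = {item: rank for rank, item in enumerate(pi)}
    n = len(sigma)
    return [
        sum(1 for later in sigma[j + 1:] if pos[later] < pos[sigma[j]])
        for j in range(n - 1)
    ]


def gmm_log_prob(pi, sigma, theta):
    """Log-probability of a complete ranking pi (assumed parameterization).

    theta[j] > 0 is the dispersion for position j; larger values concentrate
    mass on rankings that agree with sigma at that position.
    """
    n = len(sigma)
    v = inversion_code(pi, sigma)
    logp = 0.0
    for j in range(n - 1):
        # Entry j ranges over 0 .. n-1-j, so the normalizer is a finite sum.
        log_norm = math.log(sum(math.exp(-theta[j] * r) for r in range(n - j)))
        logp += -theta[j] * v[j] - log_norm
    return logp


sigma = ["a", "b", "c"]
theta = [1.0, 0.5]
# Sum of probabilities over all rankings of three items; should be close to 1.
total = sum(
    math.exp(gmm_log_prob(list(p), sigma, theta))
    for p in itertools.permutations(sigma)
)
```

In a Dirichlet process mixture, each cluster would carry its own central ordering and dispersion vector, and a Gibbs sampler would evaluate a likelihood of this kind when reassigning a ranking to a cluster.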

Similar resources

Dirichlet Process Mixtures of Generalized Linear Models

We propose Dirichlet Process mixtures of Generalized Linear Models (DP-GLM), a new class of methods for nonparametric regression. Given a data set of input-response pairs, the DP-GLM produces a global model of the joint distribution through a mixture of local generalized linear models. DP-GLMs allow both continuous and categorical inputs, and can model the same class of responses that can be mo...

Learning Mallows Models with Pairwise Preferences

Learning preference distributions is a key problem in many areas (e.g., recommender systems, IR, social choice). However, many existing methods require restrictive data models for evidence about user preferences. We relax these restrictions by considering as data arbitrary pairwise comparisons—the fundamental building blocks of ordinal rankings. We develop the first algorithms for learning Mall...

Effective sampling and learning for mallows models with pairwise-preference data

Learning preference distributions is a critical problem in many areas (e.g., recommender systems, IR, social choice). However, many existing learning and inference methods impose restrictive assumptions on the form of user preferences that can be admitted as evidence. We relax these restrictions by considering as data arbitrary pairwise comparisons of alternatives, which represent the fundament...

Preferences in college applications - A non-parametric Bayesian analysis of top-10 rankings

Applicants to degree courses in Irish colleges and universities rank up to ten degree courses from a list of over five hundred. These data provide a wealth of information concerning applicant degree choices. A Dirichlet process mixture of generalized Mallows models is used to explore data from a cohort of applicants. We find strong and diverse clusters, which in turn give us important insight...

Multiple Orderings of Events in Disease Progression

The event-based model constructs a discrete picture of disease progression from cross-sectional data sets, with each event corresponding to a new biomarker becoming abnormal. However, it relies on the assumption that all subjects follow a single event sequence. This is a major simplification for sporadic disease data sets, which are highly heterogeneous, include distinct subgroups, and contain ...

Publication date: 2010