Information Bottleneck Co-clustering

Authors

  • Pu Wang
  • Carlotta Domeniconi
  • Kathryn Blackmond Laskey
Abstract

Co-clustering has emerged as an important approach for mining contingency data matrices. We present a novel approach to co-clustering based on the Information Bottleneck principle, called Information Bottleneck Co-clustering (IBCC), which supports both soft-partition and hard-partition co-clusterings and leverages an annealing-style strategy to bypass local optima. Existing co-clustering methods require the user to specify the number of row clusters and the number of column clusters separately. In practice, though, the numbers of row and column clusters may not be independent. To address this issue, we also present an agglomerative Information Bottleneck Co-clustering (aIBCC) approach, which automatically captures the relation between the numbers of clusters. The experimental results demonstrate the effectiveness and efficiency of our techniques.
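As a rough illustration of the information-theoretic view of co-clustering a contingency matrix, the sketch below measures how much mutual information between rows and columns survives after a hard co-clustering. This is not the IBCC or aIBCC algorithm from the abstract, whose details are not given here. The function names, the toy count matrix, and the cluster labels are all assumptions made for the example.

```python
import math


def normalize(counts):
    """Turn a contingency (count) matrix into a joint distribution p(x, y)."""
    total = sum(sum(row) for row in counts)
    return [[c / total for c in row] for row in counts]


def mutual_information(joint):
    """Mutual information, in nats, of a joint distribution given as a list of rows."""
    row_marg = [sum(row) for row in joint]
    col_marg = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # zero cells contribute nothing
                mi += p * math.log(p / (row_marg[i] * col_marg[j]))
    return mi


def aggregate(joint, row_labels, col_labels, n_row_clusters, n_col_clusters):
    """Sum p(x, y) over each (row cluster, column cluster) block (hard partition)."""
    agg = [[0.0] * n_col_clusters for _ in range(n_row_clusters)]
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            agg[row_labels[i]][col_labels[j]] += p
    return agg


# Hypothetical 4x4 contingency matrix with two obvious blocks.
counts = [
    [5, 5, 0, 0],
    [4, 6, 0, 0],
    [0, 0, 7, 3],
    [0, 0, 2, 8],
]
joint = normalize(counts)

# A co-clustering that follows the block structure, and one that cuts across it.
good = aggregate(joint, [0, 0, 1, 1], [0, 0, 1, 1], 2, 2)
bad = aggregate(joint, [0, 1, 0, 1], [0, 1, 0, 1], 2, 2)

mi_full = mutual_information(joint)
mi_good = mutual_information(good)
mi_bad = mutual_information(bad)
print(mi_full, mi_good, mi_bad)
```

In this toy case the block-aligned co-clustering keeps most of the original mutual information, while the partition that cuts across the blocks loses most of it. Aggregation can never increase mutual information, so the value for the full matrix is an upper bound on both.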

Similar resources

A fuzzy co-clustering algorithm for biomedical data

Fuzzy co-clustering extends co-clustering by assigning membership functions to both the objects and the features, and helps improve the clustering accuracy of biomedical data. In this paper, we introduce a new fuzzy co-clustering algorithm based on information bottleneck, named ibFCC. The ibFCC formulates an objective function which includes a distance function that employs information bott...

Co-Clustering via Information-Theoretic Markov Aggregation

We present an information-theoretic cost function for co-clustering, i.e., for simultaneous clustering of two sets based on similarities between their elements. By constructing a simple random walk on the corresponding bipartite graph, our cost function is derived from a recently proposed generalized framework for information-theoretic Markov chain aggregation. The goal of our cost function is ...

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have demonstrated their performance in speech recognition systems, both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. The Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...

Information Bottleneck for Non Co-Occurrence Data

We present a general model-independent approach to the analysis of data in cases when these data do not appear in the form of co-occurrence of two variables X, Y, but rather as a sample of values of an unknown (stochastic) function Z(X, Y). For example, in gene expression data, the expression level Z is a function of gene X and condition Y; or in movie ratings data the rating Z is a function o...

An Analysis of Model-based Clustering, Competitive Learning, and Information Bottleneck

This paper provides a general formulation of probabilistic model-based clustering with deterministic annealing (DA), which leads to a unifying analysis of k-means, EM clustering, soft competitive learning algorithms (e.g., self-organizing map), and information bottleneck. The analysis points out an interesting yet not well-recognized connection between the k-means and EM clustering—they are jus...

Publication date: 2010