Search results for: instance clustering

Number of results: 178323

2005
Ashish Vaswani

So far, we have mostly talked about communities in the sense of discovering one, or a few, densely linked subgraphs. We departed from this interpretation at the end of last lecture, when we defined the notion of the modularity of a clustering. There, we are interested in the division of a graph into disjoint partitions (or clusters) of nodes, and the quality of this clustering. Clustering of da...

2017
Flavio Chierichetti Ravi Kumar Silvio Lattanzi Sergei Vassilvitskii

We study the question of fair clustering under the disparate impact doctrine, where each protected class must have approximately equal representation in every cluster. We formulate the fair clustering problem under both the k-center and the k-median objectives, and show that even with two protected classes the problem is challenging, as the optimum solution can violate common conventions—for in...

2009
João M. M. Duarte Ana L. N. Fred F. Jorge F. Duarte

Recent work has focused on the incorporation of a priori knowledge into the data clustering process, in the form of pairwise constraints, aiming to improve clustering quality and find appropriate clustering solutions to specific tasks or interests. In this work, we integrate must-link and cannot-link constraints into the cluster ensemble framework. Two algorithms for combining multiple data partit...

2013
Yuri Malitsky Ashish Sabharwal Horst Samulowitz Meinolf Sellmann

Different solution approaches for combinatorial problems often exhibit incomparable performance that depends on the concrete problem instance to be solved. Algorithm portfolios aim to combine the strengths of multiple algorithmic approaches by training a classifier that selects or schedules solvers dependent on the given instance. We devise a new classifier that selects solvers based on a cost-...

2014
Irene Diaz-Valenzuela Maria J. Martín-Bautista M. Amparo Vila

This paper introduces a new method of fuzzy semisupervised hierarchical clustering using fuzzy instance level constraints. It introduces the concepts of fuzzy must-link and fuzzy cannot-link constraints and uses them to find the optimum α-cut of a dendrogram. This method is used to approach the problem of classifying scientific publications in web digital libraries. It is tested on real data fro...

2006
Ergun Biçici Deniz Yuret

We focus on answering word analogy questions by using clustering techniques. The increased performance in answering word similarity questions can have many possible applications, including question answering and information retrieval. We present an analysis of clustering algorithms’ performance on answering word similarity questions. This paper’s contributions can be summarized as: (i) casting ...

2011
Meihong Wang Fei Sha

We propose techniques of convex optimization for information theoretical clustering. The clustering objective is to maximize the mutual information between data points and cluster assignments. We formulate this problem first as an instance of max k cut on weighted graphs. We then apply the technique of semidefinite programming (SDP) relaxation to obtain a convex SDP problem. We show how the sol...

2015
Konstantin Makarychev Yury Makarychev Aravindan Vijayaraghavan

In this paper, we propose and study a semi-random model for the Correlation Clustering problem on arbitrary graphs G. We give two approximation algorithms for Correlation Clustering instances from this model. The first algorithm finds a solution of value (1 + δ) opt-cost + O_δ(n log n) with high probability, where opt-cost is the value of the optimal solution (for every δ > 0). The second algorit...

2001
Qi Zhang Sally A. Goldman

We present a new multiple-instance (MI) learning technique (EM-DD) that combines EM with the diverse density (DD) algorithm. EM-DD is a general-purpose MI algorithm that can be applied with boolean or real-value labels and makes real-value predictions. On the boolean Musk benchmarks, the EM-DD algorithm without any tuning significantly outperforms all previous algorithms. EM-DD is relatively ins...

2000
Huiru Zheng Sarabjot Singh Anand John G Hughes Norman D Black

Isolation and purification of the active principle within natural compounds plays an important role in drug development. MS (mass spectrometry) is used as a detector in HPLC (high performance liquid chromatography) systems to aid the determination of novel compound structures. Clustering techniques provide useful tools for intelligent data analysis within this context. In this paper, we analyse...
