Search results for: instance clustering
Number of results: 178323
So far, we have mostly talked about communities in the sense of discovering one, or a few, densely linked subgraphs. We departed from this interpretation at the end of last lecture, when we defined the notion of the modularity of a clustering. There, we are interested in the division of a graph into disjoint partitions (or clusters) of nodes, and the quality of this clustering. Clustering of da...
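As an illustration of the modularity notion mentioned in this snippet, here is a minimal sketch. It assumes the standard Newman-Girvan definition for an undirected, unweighted graph without self-loops; the truncated abstract does not spell out its definition, and the function name and input layout are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity of a partition (assumed definition).

    edges:     list of (u, v) pairs, one per undirected edge, no self-loops
    community: dict mapping each node to its cluster label
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")

    internal = defaultdict(int)    # edges with both endpoints in the cluster
    degree_sum = defaultdict(int)  # total degree of the cluster's nodes

    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1

    # Q = sum over clusters c of  L_c / m - (d_c / 2m)^2
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Two triangles joined by a single bridge edge (illustrative data).
example_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_clusters = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_cluster = {node: "A" for node in range(6)}

if __name__ == "__main__":
    print(modularity(example_edges, two_clusters))
    print(modularity(example_edges, one_cluster))
```

Placing every node in a single cluster gives a modularity of zero under this definition, which is why the measure rewards partitions that keep more edges inside clusters than a degree-preserving random graph would.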
We study the question of fair clustering under the disparate impact doctrine, where each protected class must have approximately equal representation in every cluster. We formulate the fair clustering problem under both the k-center and the k-median objectives, and show that even with two protected classes the problem is challenging, as the optimum solution can violate common conventions—for in...
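One way to make "approximately equal representation in every cluster" concrete is a balance score for two protected classes. The sketch below assumes balance is the minimum, over clusters, of the smaller of the two class ratios; the truncated snippet does not state its exact measure, so the definition, function name, and data are illustrative assumptions.

```python
from collections import Counter, defaultdict


def clustering_balance(color, cluster, classes=("red", "blue")):
    """Balance of a clustering for two protected classes (assumed measure).

    color:   dict mapping each point to one of the two class labels
    cluster: dict mapping each point to its cluster label
    Returns a value in [0, 1]; 1 means every cluster is perfectly balanced,
    0 means some cluster contains only one class.
    """
    first, second = classes
    counts = defaultdict(Counter)
    for point, label in cluster.items():
        counts[label][color[point]] += 1

    worst = 1.0
    for tally in counts.values():
        a, b = tally[first], tally[second]
        if a == 0 or b == 0:
            return 0.0  # a single-class cluster has zero balance
        worst = min(worst, a / b, b / a)
    return worst


# Illustrative data: seven points, two clusters.
example_color = {0: "red", 1: "red", 2: "blue", 3: "blue",
                 4: "red", 5: "blue", 6: "blue"}
example_cluster = {0: "c0", 1: "c0", 2: "c0", 3: "c0",
                   4: "c1", 5: "c1", 6: "c1"}

if __name__ == "__main__":
    print(clustering_balance(example_color, example_cluster))
```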
Recent work has focused on the incorporation of a priori knowledge into the data clustering process, in the form of pairwise constraints, aiming to improve clustering quality and find appropriate clustering solutions to specific tasks or interests. In this work, we integrate must-link and cannot-link constraints into the cluster ensemble framework. Two algorithms for combining multiple data partit...
Different solution approaches for combinatorial problems often exhibit incomparable performance that depends on the concrete problem instance to be solved. Algorithm portfolios aim to combine the strengths of multiple algorithmic approaches by training a classifier that selects or schedules solvers dependent on the given instance. We devise a new classifier that selects solvers based on a cost-...
This paper introduces a new method of fuzzy semisupervised hierarchical clustering using fuzzy instance level constraints. It introduces the concepts of fuzzy must-link and fuzzy cannot-link constraints and uses them to find the optimum α-cut of a dendrogram. This method is used to approach the problem of classifying scientific publications in web digital libraries. It is tested on real data fro...
We focus on answering word analogy questions by using clustering techniques. The increased performance in answering word similarity questions can have many possible applications, including question answering and information retrieval. We present an analysis of clustering algorithms’ performance on answering word similarity questions. This paper’s contributions can be summarized as: (i) casting ...
We propose techniques of convex optimization for information theoretical clustering. The clustering objective is to maximize the mutual information between data points and cluster assignments. We formulate this problem first as an instance of max k cut on weighted graphs. We then apply the technique of semidefinite programming (SDP) relaxation to obtain a convex SDP problem. We show how the sol...
In this paper, we propose and study a semi-random model for the Correlation Clustering problem on arbitrary graphs G. We give two approximation algorithms for Correlation Clustering instances from this model. The first algorithm finds a solution of value (1 + δ) opt-cost + O_δ(n log n) with high probability, where opt-cost is the value of the optimal solution (for every δ > 0). The second algorit...
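To make the cost being approximated concrete, here is a minimal sketch of the usual disagreement objective for Correlation Clustering on a signed graph. It assumes the paper minimizes disagreements (the snippet is truncated before stating this), and the function name and edge encoding are hypothetical.

```python
def disagreement_cost(signed_edges, cluster):
    """Number of edges that disagree with a clustering (assumed objective).

    signed_edges: list of (u, v, sign) with sign +1 ("similar")
                  or -1 ("dissimilar")
    cluster:      dict mapping each node to its cluster label
    A +1 edge disagrees when its endpoints are separated;
    a -1 edge disagrees when its endpoints share a cluster.
    """
    cost = 0
    for u, v, sign in signed_edges:
        same = cluster[u] == cluster[v]
        if (sign > 0 and not same) or (sign < 0 and same):
            cost += 1
    return cost


# Illustrative instance: a triangle with one "dissimilar" edge.
example_signed_edges = [("a", "b", +1), ("b", "c", +1), ("a", "c", -1)]
all_together = {"a": 0, "b": 0, "c": 0}
all_apart = {"a": 0, "b": 1, "c": 2}

if __name__ == "__main__":
    print(disagreement_cost(example_signed_edges, all_together))
    print(disagreement_cost(example_signed_edges, all_apart))
```

On this triangle no clustering reaches zero cost, which illustrates why the problem is posed as an optimization over all partitions instead of a consistency check.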
We present a new multiple-instance (MI) learning technique (EM-DD) that combines EM with the diverse density (DD) algorithm. EM-DD is a general-purpose MI algorithm that can be applied with boolean or real-value labels and makes real-value predictions. On the boolean Musk benchmarks, the EM-DD algorithm without any tuning significantly outperforms all previous algorithms. EM-DD is relatively ins...
Isolation and purification of the active principle within natural compounds plays an important role in drug development. MS (mass spectrometry) is used as a detector in HPLC (high performance liquid chromatography) systems to aid the determination of novel compound structures. Clustering techniques provide useful tools for intelligent data analysis within this context. In this paper, we analyse...