Similar resources
k-means++ under Approximation Stability
The Lloyd’s algorithm, also known as the k-means algorithm, is one of the most popular algorithms for solving the k-means clustering problem in practice. However, it does not give any performance guarantees. This means that there are datasets on which this algorithm can behave very badly. One reason for poor performance on certain datasets is bad initialization. The following simple sampling ba...
Streaming k-means approximation
We provide a clustering algorithm that approximately optimizes the k-means objective, in the one-pass streaming setting. We make no assumptions about the data, and our algorithm is very light-weight in terms of memory, and computation. This setting is applicable to unsupervised learning on massive data sets, or resource-constrained devices. The two main ingredients of our theoretical work are: ...
Stability of k-Means Clustering
We consider the stability of k-means clustering problems. Clustering stability is a common heuristic used to determine the number of clusters in a wide variety of clustering applications. We continue the theoretical analysis of clustering stability by establishing a complete characterization of clustering stability in terms of the number of optimal solutions to the clustering optimization prob...
Stability of $K$-Means Clustering
We phrase $K$-means clustering as an empirical risk minimization procedure over a class $H_K$ and explicitly calculate the covering number for this class. Next, we show that stability of $K$-means clustering is characterized by the geometry of $H_K$ with respect to the underlying distribution. We prove that in the case of a unique global minimizer, the clustering solution is stable with respect to comple...
Fast k-Means Algorithms with Constant Approximation
In this paper we study the k-means clustering problem. It is well known that the general version of this problem is NP-hard. Numerous approximation algorithms have been proposed for this problem. In this paper, we propose three constant approximation algorithms for k-means clustering. The first algorithm runs in time O(( k )nd), where k is the number of clusters, n is the number of input points,...
Journal
Journal title: Theoretical Computer Science
Year: 2015
ISSN: 0304-3975
DOI: 10.1016/j.tcs.2015.04.030