نتایج جستجو برای: k means
تعداد نتایج: 702376 فیلتر نتایج به سال:
Mini Batch K-means ([11]) has been proposed as an alternative to the K-means algorithm for clustering massive datasets. The advantage of this algorithm is to reduce the computational cost by not using all the dataset each iteration but a subsample of a fixed size. This strategy reduces the number of distance computations per iteration at the cost of lower cluster quality. The purpose of this pa...
K-means (MacQueen, 1967) [1] is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. The procedure follows a simple and easy way to classify a given data set to a predefined, say K number of clusters. Determination of K is a difficult job and it is not known that which value of K can partition the objects as per our intuition. To overcome this probl...
This paper presents a generalized version of the conventional k-means clustering algorithm [Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability, 1, University of California Press, Berkeley, 1967, p. 281]. Not only is this new one applicable to ellipse-shaped data clusters without dead-unit problem, but also performs correct clustering without pre-assigning the exact...
خوشه بندی فرایندی است که در طی آن مجموعه ای از نمونه ها به خوشه هایی تقسیم می شوند که اعضای هرخوشه بیشترین شباهت را به یکدیگر داشته باشند و خوشه های مختلف با یکدیگر بیشترین تفاوت را داشته باشند. خوشه بندی یکی از تکنیک های داده کاوی و آنالیز داده متعارف می باشد. درخوشه بندی داده ها، در مسائل با اندازه داده بزگتر رسیدن به حل بهینه مشکل تر می باشد و در نتیجه مدت زمان لازم برای رسیدت به حل های قابل...
k-means++ [5] seeding procedure is a simple sampling based algorithm that is used to quickly find k centers which may then be used to start the Lloyd’s method. There has been some progress recently on understanding this sampling algorithm. Ostrovsky et al. [10] showed that if the data satisfies the separation condition that ∆k−1(P ) ∆k(P ) ≥ c (∆i(P ) is the optimal cost w.r.t. i centers, c > 1...
We investigate, theoretically and empirically, the effectiveness of kernel K-means++ samples as landmarks in the Nyström method for low-rank approximation of kernel matrices. Previous empirical studies (Zhang et al., 2008; Kumar et al., 2012) observe that the landmarks obtained using (kernel) K-means clustering define a good lowrank approximation of kernel matrices. However, the existing work d...
The Lloyd’s algorithm, also known as the k-means algorithm, is one of the most popular algorithms for solving the k-means clustering problem in practice. However, it does not give any performance guarantees. This means that there are datasets on which this algorithm can behave very badly. One reason for poor performance on certain datasets is bad initialization. The following simple sampling ba...
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید