Search results for: means

Number of results: 350089

Journal: Journal of Computer and Robotics
Rasool Azimi (Faculty of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran), Hedieh Sajedi (Department of Computer Science, College of Science, University of Tehran, Tehran, Iran)

Identifying clusters, or clustering, is an important aspect of data analysis. It is the task of grouping a set of objects in such a way that objects in the same group (cluster) are more similar, in some sense or another, to each other than to those in other groups. It is a main task of exploratory data mining and a common technique for statistical data analysis. This paper proposes an improved version of the k-means algorithm, namely persistent k...
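For orientation, the grouping task described in this abstract can be sketched with the standard Lloyd-style k-means iteration. This is a minimal baseline under stated assumptions, not the improved variant the paper proposes (its description is cut off above). The names `lloyd_kmeans` and `squared_distance`, the random initialization, and the `iterations` and `seed` parameters are illustrative choices.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd_kmeans(points, k, iterations=100, seed=0):
    """Baseline k-means (illustrative sketch); returns (centers, labels)."""
    rng = random.Random(seed)
    # Assumption: start from k distinct input points chosen uniformly at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return centers, labels
```

Calling `lloyd_kmeans(points, 2)` on a list of coordinate tuples returns the final centers and one cluster label per input point. The result depends on the initial centers, which is the sensitivity several of the abstracts in this list address.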

Journal: CoRR 2016
Marco Capó, Aritz Pérez Martínez, José Antonio Lozano

Due to the progressive growth of the amount of data available in a wide variety of scientific fields, it has become more difficult to manipulate and analyze such information. Even though datasets have grown in size, the K-means algorithm remains one of the most popular clustering methods, in spite of its dependency on the initial settings and high computational cost, especially in terms of d...

2016
Edo Liberty, Ram Sriharsha, Maxim Sviridenko

This paper shows that one can be competitive with the k-means objective while operating online. In this model, the algorithm receives vectors v_1, ..., v_n one by one in an arbitrary order. For each vector v_t the algorithm outputs a cluster identifier before receiving v_{t+1}. Our online algorithm generates Õ(k) clusters whose k-means cost is Õ(W*), where W* is the optimal k-means cost using k cl...
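The online protocol in this abstract (label each vector before the next one arrives) can be illustrated with a much simpler method: sequential k-means with running-mean updates. This sketch is not the paper's algorithm and carries none of its guarantees; it keeps exactly k clusters rather than Õ(k). The class name `SequentialKMeans` and the choice to use the first k vectors as initial centers are assumptions made for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


class SequentialKMeans:
    """Illustrative online clustering: one irrevocable label per arriving vector."""

    def __init__(self, k):
        self.k = k
        self.centers = []  # current center of each cluster
        self.counts = []   # number of vectors assigned to each cluster

    def assign(self, v):
        """Return a cluster id for v immediately, then update that center."""
        # Assumption: the first k vectors each open a new cluster.
        if len(self.centers) < self.k:
            self.centers.append(list(v))
            self.counts.append(1)
            return len(self.centers) - 1
        # Otherwise pick the nearest existing center.
        j = min(range(self.k), key=lambda i: squared_distance(v, self.centers[i]))
        # Running-mean update keeps the center equal to the mean of its members.
        self.counts[j] += 1
        step = 1.0 / self.counts[j]
        self.centers[j] = [c + step * (x - c) for c, x in zip(self.centers[j], v)]
        return j
```

Feeding a stream through `assign` one vector at a time yields a label for each vector without looking ahead, which is the constraint the abstract describes.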

Journal: PVLDB 2012
Bahman Bahmani, Benjamin Moseley, Andrea Vattani, Ravi Kumar, Sergei Vassilvitskii

Over half a century old and showing no signs of aging, k-means remains one of the most popular data processing algorithms. As is well-known, a proper initialization of k-means is crucial for obtaining a good final solution. The recently proposed k-means++ initialization algorithm achieves this, obtaining an initial set of centers that is provably close to the optimum solution. A major downside ...

2009
Nir Ailon, Ragesh Jaiswal, Claire Monteleoni

We provide a clustering algorithm that approximately optimizes the k-means objective in the one-pass streaming setting. We make no assumptions about the data, and our algorithm is very lightweight in terms of memory and computation. This setting is applicable to unsupervised learning on massive data sets or resource-constrained devices. The two main ingredients of our theoretical work are: ...

Journal: CoRR 2014
Apoorv Agarwal, Anna Choromanska, Krzysztof Choromanski

In this paper, we compare three initialization schemes for the KMEANS clustering algorithm: 1) random initialization (KMEANSRAND), 2) KMEANS++, and 3) KMEANSD++. Both KMEANSRAND and KMEANS++ have a major drawback: the value of k needs to be set by the user of the algorithms. Kang (2013) recently proposed a novel use of determinantal point processes for sampling the initial centroids for the KMEANS a...

Journal: CoRR 2013
Ragesh Jaiswal, Prachi Jain, Saumya Yadav

The k-means++ seeding algorithm is one of the most popular algorithms used for finding the initial k centers when using the k-means heuristic. The algorithm is a simple sampling procedure and can be described as follows: Pick the first center randomly from among the given points. For i > 1, pick a point to be the i-th center with probability proportional to the square of the Euclidean dist...
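The sampling procedure described in this abstract can be sketched as follows. The abstract is cut off mid-sentence, so the code assumes the usual k-means++ rule: each point is weighted by its squared Euclidean distance to the nearest center chosen so far. The names `kmeanspp_seed` and `squared_distance` and the `seed` parameter are illustrative.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, seed=0):
    """Pick k initial centers by D^2 sampling (illustrative sketch).

    Assumes `points` contains at least k distinct points; otherwise all
    weights can reach zero and random.choices raises ValueError.
    """
    rng = random.Random(seed)
    # First center: uniformly at random from the given points.
    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest chosen center.
    d2 = [squared_distance(p, centers[0]) for p in points]
    while len(centers) < k:
        # Assumed rule: sample with probability proportional to d2.
        chosen = rng.choices(points, weights=d2, k=1)[0]
        centers.append(chosen)
        # Refresh nearest-center distances against the new center.
        d2 = [min(d, squared_distance(p, chosen)) for d, p in zip(d2, points)]
    return centers
```

A point that is already a center has weight zero and cannot be drawn again, so the returned centers sit at distinct locations. The output can be used as the starting centers for any k-means iteration.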

Journal: Research in Computing Science 2016
Eréndira Rendón Lara, Itzel M. Abundez B.

Abstract. Without a doubt, the K-means algorithm is the most widely used in the unsupervised learning community. Unfortunately, it is very sensitive to the selection of the initial centroids. For this reason, a large number of methods have been proposed for selecting the initial centers. This article presents a clustering algorithm based on the K-means algorithm, ...

Journal: CoRR 2015
Robert A. Murphy

Utilizing the sample size of a dataset, the random cluster model is employed in order to derive an estimate of the mean number of K-Means clusters to form during classification of a dataset.

Thesis: Ministry of Science, Research and Technology - Shiraz University - Faculty of Sciences, 1388

Clustering is a process in which a set of samples is divided into clusters such that the members of each cluster are as similar as possible to one another and different clusters are as different as possible from one another. Clustering is one of the common data mining and data analysis techniques. In data clustering, reaching the optimal solution becomes harder as the data size grows, and as a result the time needed to reach solutions that are...
