kmeans clustering

Comprehensive Research on Privacy Preserving Emphasizing on Distributed Clustering

2016

Sumana

Often, information is sensitive or private in nature, and mining such sensitive data violates the privacy of individuals. Privacy preserving data mining (PPDM) mines the data while aiming to preserve the privacy of sensitive data without ever actually seeing it. This paper recaps the important techniques in PPDM, such as anonymization, perturbation and cryptography. Nowadays, data mini...


Image Clustering using Color, Texture and Shape Features

Journal: TIIS 2011

Azzam Sleit, Abdel Latif Abu Dalhoum, Mohammad Qatawneh, Maryam Al-Sharief, Rawa'a Al-Jabaly, Ola Karajeh

Content Based Image Retrieval (CBIR) is an approach for retrieving similar images from an image database based on automatically derived image features. The quality of a retrieval system depends on the features used to describe image content. In this paper, we propose an image clustering system that takes a database of images as input and clusters them using the k-means clustering algorithm taking in...


Cross-entropy clustering

Journal: Pattern Recognition 2014

Jacek Tabor, Przemyslaw Spurek

We build a general and highly applicable clustering theory, which we call cross-entropy clustering (CEC for short), which joins the advantages of classical k-means (easy implementation and speed) with those of EM (affine invariance and ability to adapt to clusters of desired shapes). Moreover, contrary to k-means and EM, CEC finds the optimal number of clusters by automatically removing groups which ca...


Ontology-based Text Document Clustering

2002

Steffen Staab, Andreas Hotho

Text clustering typically involves clustering in a high dimensional space, which appears difficult with regard to virtually all practical settings. In addition, given a particular clustering result it is typically very hard to come up with a good explanation of why the text clusters have been constructed the way they are. In this paper, we propose a new approach for applying background knowledg...


A Parameterized Framework for Clustering Streams

2009

Vasudha Bhatnagar, Sharanjit Kaur, Laurent Mignet

Clustering of data streams finds important applications in tracking the evolution of various phenomena in medical, meteorological, astrophysical and seismic studies. Algorithms designed for this purpose are capable of adapting the discovered clustering model to changes in data characteristics but are not capable of adapting to the user’s requirements themselves. Based on the previous observation, ...


Fast, Linear Time Hierarchical Clustering using the Baire Metric

Journal: J. Classification 2012

Pedro Contreras, Fionn Murtagh

The Baire metric induces an ultrametric on a dataset and is of linear computational complexity, contrasted with the standard quadratic time agglomerative hierarchical clustering algorithm. In this work we evaluate empirically this new approach to hierarchical clustering. We compare hierarchical clustering based on the Baire metric with (i) agglomerative hierarchical clustering, in terms of algo...


Knowledge discovery from database Using an integration of clustering and classification

2011

Varun Kumar, Nisha Rathee

Clustering and classification are two important techniques of data mining. Classification is a supervised learning problem of assigning an object to one of several pre-defined categories based upon the attributes of the object. Clustering, by contrast, is an unsupervised learning problem that groups objects based upon distance or similarity. Each group is known as a cluster. In this paper we make use o...


Comparing an Ant-Based Clustering Algorithm with Self-Organizing Maps and K-means

2012

Clodis Boscarioli, Rosangela Villwock, Bruno Eduardo Soares

Data analysis involves different tasks, which can be performed by many different techniques and strategies. The data clustering task, an unsupervised pattern recognition process, is the task of assigning a set of objects into groups called clusters so that the objects in the same cluster are more similar to each other than to those in other clusters. This paper describes ...


A Geospatial Implementation of a Novel Delineation Clustering Algorithm Employing the K-means

2008

Tonny J. Oyana, Kara E. Scott

The overarching objective of this study is to report the implementation and performance of a novel delineation clustering algorithm employing the k-means. This study explores a newly proposed algorithm designed to increase the overall performance of the k-means clustering technique—the Fast, Efficient, and Scalable k-means algorithm (FES-kmeans*). The algorithm reduces the computational load an...


Research issues on K-means Algorithm: An Experimental Trial Using Matlab

2009

Joaquín Pérez Ortega, Rocío Boone Rojas, María J. Somodevilla

Clustering problems arise in many different applications: machine learning, data mining and knowledge discovery, data compression and vector quantization, pattern recognition and pattern classification. The k-means algorithm is considered the best-known squared error-based clustering algorithm; it is very simple and can be easily implemented to solve many practical problems. This paper ...
