Partition Affinity Propagation for Clustering Large Scale of Data in Digital Library
Authors
Abstract
Data clustering helps users navigate the large volumes of data in a digital library. In this paper, we present an improved algorithm, based on Affinity Propagation, for clustering large data sets with dense relationships. First, the input data are divided into several groups, and Affinity Propagation is applied to each group separately. The results of this first step are then grouped together in some way, and Affinity Propagation is applied to them again. Experimental results show that our algorithm, referred to as Partition Affinity Propagation, gives an encouraging speed-up over Affinity Propagation when clustering dense data sets, while clustering accuracy is almost maintained or even improved.
Index Terms: Algorithms, Affinity Propagation, Clustering methods, Dense Data, Experimentation, Performance.
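The two-stage scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the data are partitioned or how the first-stage results are combined, so the random partition, the pooling of first-stage exemplars, the nearest-exemplar final assignment, the negative squared Euclidean similarity, the median preference, and all function names (`neg_sq_dist`, `affinity_propagation`, `partition_ap`) are hypothetical choices made for this sketch.

```python
import numpy as np


def neg_sq_dist(X, Y):
    """Similarity s(i, k) = -||x_i - y_k||^2 (an assumed, commonly used choice)."""
    diff = X[:, None, :] - Y[None, :, :]
    return -np.sum(diff ** 2, axis=2)


def affinity_propagation(S, damping=0.8, n_iter=200):
    """Standard AP message passing on a dense similarity matrix S.

    Returns the indices of the exemplars. Runs a fixed number of
    iterations; there is no convergence check in this sketch.
    """
    n = S.shape[0]
    if n == 1:
        return np.array([0])
    S = S.copy()
    # Assumption: preference = median of the off-diagonal similarities.
    S[np.diag_indices(n)] = np.median(S[~np.eye(n, dtype=bool)])
    R = np.zeros((n, n))  # responsibilities r(i, k)
    A = np.zeros((n, n))  # availabilities a(i, k)
    idx = np.arange(n)
    for _ in range(n_iter):
        # r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        AS = A + S
        best = np.argmax(AS, axis=1)
        first = AS[idx, best]
        AS[idx, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[idx, best] = S[idx, best] - second
        R = damping * R + (1 - damping) * R_new
        # a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # a(k,k) = sum_{i' != k} max(0, r(i',k))
        Rp = np.maximum(R, 0)
        Rp[idx, idx] = R[idx, idx]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new[idx, idx].copy()
        A_new = np.minimum(A_new, 0)
        A_new[idx, idx] = diag
        A = damping * A + (1 - damping) * A_new
    evidence = np.diag(A + R)
    exemplars = np.flatnonzero(evidence > 0)
    if exemplars.size == 0:
        # Fallback for this sketch: keep the single strongest candidate.
        exemplars = np.array([int(np.argmax(evidence))])
    return exemplars


def partition_ap(X, n_parts=3, seed=0):
    """Two-stage sketch: AP inside each partition, then AP on the pooled exemplars."""
    rng = np.random.default_rng(seed)
    # Assumption: a uniformly random partition into n_parts groups.
    parts = np.array_split(rng.permutation(len(X)), n_parts)
    candidates = []
    for part in parts:
        if len(part) == 0:
            continue
        local = affinity_propagation(neg_sq_dist(X[part], X[part]))
        candidates.append(part[local])  # map local indices back to global ones
    candidates = np.concatenate(candidates)
    # Stage 2: run AP again, but only on the first-stage exemplars.
    chosen = affinity_propagation(neg_sq_dist(X[candidates], X[candidates]))
    final = candidates[chosen]
    # Assumption: each point is assigned to its most similar final exemplar.
    labels = final[np.argmax(neg_sq_dist(X, X[final]), axis=1)]
    return final, labels
```

Calling `partition_ap(X, n_parts=3)` on an `(n, d)` float array returns the global indices of the final exemplars and, for each point, the index of the exemplar it is assigned to. The intended saving is that each Affinity Propagation run works on a similarity matrix far smaller than the full n-by-n matrix; whether accuracy is preserved depends on the partitioning and merging choices, which this sketch only guesses at.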
Similar resources
A partition-based algorithm for clustering large-scale software systems
Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...
Local and global approaches of affinity propagation clustering for large scale data
Recently a new clustering algorithm called ‘affinity propagation’ (AP) has been proposed, which efficiently clustered sparsely related data by passing messages between data points. However, we want to cluster large scale data where the similarities are not sparse in many cases. This paper presents two variants of AP for grouping large scale data with a dense similarity matrix. The local approac...
Parallel Clustering Algorithm for Large-Scale Biological Data Sets
BACKGROUND: The recent explosion of biological data poses a great challenge for traditional clustering algorithms. As data sets grow, cluster identification requires much more memory and longer runtimes. The affinity propagation algorithm outperforms many other classical clustering algorithms and is widely applied in biological research. However, ...
Evaluation of Updating Methods in Building Blocks Dataset
With the increasing use of spatial data in daily life, the production of this data from diverse information sources with different precision and scales has grown widely. Generating new data requires a great deal of time and money. Therefore, one way to reduce costs is to update the old data at different scales using new data (produced on a similar scale). One approach to updating data i...
Clustering Large-Scale Data Based On Modified Affinity Propagation Algorithm
Traditional clustering algorithms are no longer suitable for use in data mining applications that make use of large-scale data. Many large-scale data clustering algorithms have been proposed in recent years, but most of them do not achieve high-quality clustering. Although Affinity Propagation (AP) is effective and accurate in normal data clustering, it is not effective for l...
Journal title:
Volume / Issue
Pages -
Publication date 2007