clustering validity

A Prediction-Based Visual Approach for Cluster Exploration and Cluster Validation by HOV3

2007

Ke-Bing Zhang, Mehmet A. Orgun, Kang Zhang

Predictive knowledge discovery is an important knowledge acquisition method. It is also used in the clustering process of data mining. Visualization is very helpful for high-dimensional data analysis, but it is not precise, which limits its usability in quantitative cluster analysis. In this paper, we adopt a visual technique called HOV3 to explore and verify clustering results with quantified measu...


Clustering techniques in colour image segmentation

2003

Henryk Palus

In this paper, five clustering techniques (k-means, ISODATA, merging, splitting and mean shift techniques) used for colour image segmentation are presented. Two heuristic evaluation methods (cluster validity measure VM and quality function Q) are applied. We show that evaluation functions VM and Q can be very helpful in search of best segmentation results. The best results came from k-means, me...


Computer Vision-based Color Image Segmentation with Improved Kernel Clustering

2015

Yongqing Wang, Chunxiang Wang

Color image segmentation, which is in essence a process of clustering pixels by color, has been widely applied in diverse fields over the past decades because color images contain more information than gray ones. However, traditional clustering methods do not scale well with the amount of data, which limits their ability to handle massive data effectively. We developed an improved kernel clusterin...


Biological Cluster Validity Indices Based on the Gene Ontology

2005

Nora Speer, Christian Spieth, Andreas Zell

With the invention of biotechnological high-throughput methods such as DNA microarrays and the analysis of the resulting huge amounts of biological data, clustering algorithms have gained new popularity. In practice, the question arises of which clustering algorithm and which parameter set generate the most promising results. Little work has addressed the question of evaluating and comparing the c...


An Improved Algorithm for Segregating Large Geospatial Data

2006

Kara E. Scott, Tonny J. Oyana

This study investigates an improved k-means clustering algorithm for segregating large geospatial data. Although the conventional k-means method is sufficient for small datasets, it does not perform well and therefore yields poor accuracy for high-volume datasets. Clustering methods are among the most important components in data classification, visualization, and mining highvolum...


Clustering algorithms used in 3D scene segmentation

2013

Isma Hadji, Daniel Nabelek

In this paper, we implement and compare three different clustering algorithms for the purpose of 3D image segmentation. Specifically, the K-means, Mean Shift, and Hierarchical methods are studied, and their performance is compared using cluster validity methods. Performance was analyzed in two ways, first by comparing independent results from each, and second, by comparing results where Hierarc...


Graph-Based Hierarchical Conceptual Clustering

Journal: International Journal on Artificial Intelligence Tools, 2000

Istvan Jonyer, Lawrence B. Holder, Diane J. Cook

Hierarchical conceptual clustering has proven to be a useful, although under-explored, data mining technique. A graph-based representation of structural information combined with a substructure discovery technique has been shown to be successful in knowledge discovery. The SUBDUE substructure discovery system provides one such combination of approaches. This work presents SUBDUE and the develop...


Gene Expression Data Clustering using a Fuzzy Link based Approach

2012

Rosy Sarmah

The literature contains many clustering algorithms for gene expression data that are robust against noise and outliers. The limitation of many of these algorithms is that they cannot identify overlapping and intersecting clusters. This paper presents an algorithm for clustering gene expression data using the concepts of common neighbors and fuzzy clustering for detecting intersecting ...


An Improved Algorithm of Rough K-Means Clustering Based on Variable Weighted Distance Measure

2014

Tengfei Zhang, Long Chen, Fumin Ma

The rough K-means algorithm has been shown to provide a reasonable set of lower and upper bounds for a given dataset. With the concepts of the lower and upper approximate sets, rough k-means clustering and its emerging derivatives have become valid algorithms for clustering vague information. However, most available algorithms ignore the difference of the distances between data objects and cl...


Cluster Validity Through Graph-based Boundary Analysis

2004

Jianhua Yang, Ickjai Lee

Gaining confidence that a clustering algorithm has produced meaningful results and not an accident of its usually heuristic optimization is central to data mining. This is the issue of cluster validity. We propose here a method by which proximity graphs are used to effectively detect border points and measure the margin between clusters. With analysis of boundary situation, we design a framewor...
