high dimensional data

Combinando semi-supervisão e hubness para aprimorar o agrupamento de dados em alta dimensão [Combining semi-supervision and hubness to improve the clustering of high-dimensional data]

2016

Mateus C. de Lima Maria Camila Nardini Barioni Humberto Luiz Razente

The curse of dimensionality makes high-dimensional data analysis a challenging task for data clustering techniques. To deal with high-dimensional data, this paper presents a clustering approach that combines two strategies: semi-supervision and density estimation based on hubness scores. Initial experimental results show a good performance when applied on real dat...
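As a rough illustration of what a hubness score measures, the sketch below counts, for each point, how often it appears in the k-nearest-neighbor lists of the other points (the usual k-occurrence definition). It is not the paper's method: the function name `hubness_scores`, the Euclidean distance, and the toy data are assumptions made for the example.

```python
import math

def hubness_scores(points, k):
    """Return, for each point, how many other points list it among their k nearest neighbors."""
    n = len(points)
    scores = [0] * n
    for i in range(n):
        # Distances from point i to every other point (the point itself is excluded).
        dists = [(math.dist(points[i], points[j]), j) for j in range(n) if j != i]
        dists.sort()
        # Each of the k closest points receives one "vote" from point i.
        for _, j in dists[:k]:
            scores[j] += 1
    return scores

# Toy data (hypothetical): four corners of a square, its center, and one distant outlier.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5), (5.0, 5.0)]
scores = hubness_scores(points, k=2)
print(scores)
```

In this toy set the central point collects the most votes and the outlier collects none, which is the kind of signal a hubness-based density estimate could use.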

Explaining Ant-Based Clustering on the basis of Self-Organizing Maps

2008

Lutz Herrmann Alfred Ultsch

Ant-based clustering is a nature-inspired technique in which stochastic agents perform the task of clustering high-dimensional data. This paper analyzes the popular technique of Lumer/Faieta. It is shown that the Lumer/Faieta approach is strongly related to Kohonen’s Self-Organizing Batch Map. A unifying basis is derived in order to assess strengths and weaknesses of both techniques. The behaviou...

Effective Evaluation Measures for Subspace Clustering of Data Streams

2013

Marwan Hassani Yunsu Kim Seungjin Choi Thomas Seidl

Nowadays, most streaming data sources are becoming high-dimensional. Accordingly, subspace stream clustering, which aims at finding evolving clusters within subgroups of dimensions, has gained significant importance. However, existing subspace clustering evaluation measures are mainly designed for static data, and cannot reflect the quality of the evolving nature of data streams. On the other ...

Collaborative Low-Rank Subspace Clustering

Journal: CoRR 2017

Stephen Tierney Yi Guo Junbin Gao

In this paper we present Collaborative Low-Rank Subspace Clustering. Given multiple observations of a phenomenon we learn a unified representation matrix. This unified matrix incorporates the features from all the observations, thus increasing the discriminative power compared with learning the representation matrix on each observation separately. Experimental evaluation shows that our method o...

Hyperdimensional Data Analysis Using Parallel Coordinates

1990

Edward J. Wegman

This paper presents the basic results for using the parallel coordinate representation as a high-dimensional data analysis tool. Several alternatives are reviewed. The basic algorithm for parallel coordinates is laid out and its properties as a projective transformation are discussed. Several of the duality results are discussed along with their interpretations as data analysis ...
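One well-known duality of parallel coordinates can be checked in a few lines: points lying on a line y = m*x + b in Cartesian coordinates become segments that all pass through a single point when drawn between two parallel axes. The sketch below assumes the two axes sit at horizontal positions 0 and 1 and that m is not 1; the helper names are illustrative and do not come from the paper.

```python
from fractions import Fraction

def to_polyline(point):
    """Map an n-dimensional point to polyline vertices (axis index, value)."""
    return [(i, v) for i, v in enumerate(point)]

def segment_value(p, t):
    """Height at horizontal position t of the (extended) line through (0, p[0]) and (1, p[1])."""
    return p[0] + t * (p[1] - p[0])

def dual_point(m, b):
    """Dual of the line y = m*x + b, with axes at positions 0 and 1 (requires m != 1)."""
    return (Fraction(1) / (1 - m), Fraction(b) / (1 - m))

# Four collinear points on y = -2x + 3; exact rational arithmetic avoids rounding noise.
m, b = Fraction(-2), Fraction(3)
points = [(Fraction(x), m * x + b) for x in range(4)]

t, y = dual_point(m, b)
heights = [segment_value(p, t) for p in points]
print(t, y, heights)
```

Every segment reaches the same height at the same horizontal position, so a linear relation between two variables shows up visually as a common crossing point between their axes.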

Recognizing Human-Object Interactions Using Sparse Subspace Clustering

2013

Ivan Bogun Eraldo Ribeiro

This is difficult because: 1. object appearance varies; 2. the way of interacting with an object varies among people; 3. both the object and the body parts of interest might be occluded. Contributions: By using motion information alone, we propose an unsupervised framework for clustering and classifying videos of people interacting with objects. The method is based on [2]. We show that: 1. human-object intera...

Hierarchical Density-Based Clustering of Categorical Data and a Simplification

2007

Bill Andreopoulos Aijun An Xiaogang Wang

A challenge involved in applying density-based clustering to categorical datasets is that the ‘cube’ of attribute values has no ordering defined. We propose the HIERDENC algorithm for hierarchical density-based clustering of categorical data. HIERDENC offers a basis for designing simpler clustering algorithms that balance the tradeoff of accuracy and speed. The characteristics of HIERDENC includ...

Detection and Visualization of Subspace Cluster Hierarchies

2007

Elke Achtert Christian Böhm Hans-Peter Kriegel Peer Kröger Ina Müller-Gorman Arthur Zimek

Subspace clustering (also called projected clustering) addresses the problem that different sets of attributes may be relevant for different clusters in high-dimensional feature spaces. In this paper, we propose the algorithm DiSH (Detecting Subspace cluster Hierarchies), which improves over existing approaches in the following points: First, DiSH can detect clusters in subspaces of significantly...

A convergence theorem for the fuzzy subspace clustering (FSC) algorithm

Journal: Pattern Recognition 2008

Guojun Gan Jianhong Wu

We establish the convergence of the fuzzy subspace clustering (FSC) algorithm by applying Zangwill’s convergence theorem. We show that the iteration sequence produced by the FSC algorithm terminates at a point in the solution set S or there is a subsequence converging to a point in S. In addition, we present experimental results that illustrate the convergence properties of the FSC algorithm in...

Cartification: From Similarities to Itemset Frequencies

2012

Bart Goethals

We propose a transformation method to circumvent the problems with high dimensional data. For each object in the data, we create an itemset of the k-nearest neighbors of that object, not just for one of the dimensions, but for many views of the data. On the resulting collection of sets, we can mine frequent itemsets; that is, sets of points that are frequently seen together in some of the views...
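The transformation described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each single dimension is one "view", it puts the object itself into its own itemset alongside its k nearest neighbors, and it mines only frequent pairs rather than general itemsets. The names `build_carts` and `frequent_pairs` and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def build_carts(data, k):
    """For every dimension (view) and every object, build the itemset of the
    object's k nearest neighbors in that dimension, plus the object itself."""
    carts = []
    n_dims = len(data[0])
    for d in range(n_dims):
        for i, row in enumerate(data):
            others = [j for j in range(len(data)) if j != i]
            # Sort the other objects by distance along dimension d (index breaks ties).
            others.sort(key=lambda j: (abs(data[j][d] - row[d]), j))
            carts.append(frozenset([i] + others[:k]))
    return carts

def frequent_pairs(carts, min_support):
    """Count how many carts contain each pair of objects and keep the frequent ones."""
    counts = Counter()
    for cart in carts:
        counts.update(combinations(sorted(cart), 2))
    return {pair: c for pair, c in counts.items() if c >= min_support}

# Toy data (hypothetical): two well-separated groups of three objects each.
data = [(1.0, 1.0), (1.2, 1.4), (1.5, 0.9), (8.0, 9.0), (8.3, 9.5), (8.6, 8.8)]
carts = build_carts(data, k=2)
pairs = frequent_pairs(carts, min_support=6)
print(sorted(pairs))
```

Objects that keep landing in the same itemsets across views form frequent pairs, while objects from different groups never co-occur, which is the intuition behind reading cluster structure off itemset frequencies.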
