High-dimensional clustering

Visual Exploration of High-Dimensional Data through Subspace Analysis and Dynamic Projections

Journal: Comput. Graph. Forum 2015

Shusen Liu Bei Wang Jayaraman J. Thiagarajan Peer-Timo Bremer Valerio Pascucci

We introduce a novel interactive framework for visualizing and exploring high-dimensional datasets based on subspace analysis and dynamic projections. We assume the high-dimensional dataset can be represented by a mixture of low-dimensional linear subspaces with mixed dimensions, and provide a method to reliably estimate the intrinsic dimension and linear basis of each subspace extracted from t...

Geometric Conditions for Subspace-Sparse Recovery

2015

Chong You René Vidal

Given a dictionary Π and a signal ξ = Πx generated by a few linearly independent columns of Π, classical sparse recovery theory deals with the problem of uniquely recovering the sparse representation x of ξ. In this work, we consider the more general case where ξ lies in a low-dimensional subspace spanned by a few columns of Π, which are possibly linearly dependent. In this case, x may not uniqu...

ELKI: A Software System for Evaluation of Subspace Clustering Algorithms

2008

Elke Achtert Hans-Peter Kriegel Arthur Zimek

In order to establish consolidated standards in novel data mining areas, newly proposed algorithms need to be evaluated thoroughly. Many publications compare a new proposition – if at all – with one or two competitors or even with a so-called “naïve” ad hoc solution. For the prolific field of subspace clustering, we propose a software framework implementing many prominent algorithms and, thus, ...

Guess Who Rated This Movie: Identifying Users Through Subspace Clustering

2012

Amy Zhang Nadia Fawaz Stratis Ioannidis Andrea Montanari

It is often the case that, within an online recommender system, multiple users share a common account. Can such shared accounts be identified solely on the basis of the user-provided ratings? Once a shared account is identified, can the different users sharing it be identified as well? Whenever such user identification is feasible, it opens the way to possible improvements in personalized recomm...

Effective Subspace Clustering with Dimension Pairing in the Presence of High Levels of Noise

2006

Andrew Foss Osmar R. Zaïane

Attempts at clustering large and high dimensional data have been made with a focus on scalability. While still inefficient for more complex problems, the effectiveness is also questionable because data becomes very sparse in a high dimensional space. If clusters exist in the data, they tend to remain hidden in some unidentified sub-spaces. So far, the few solutions to this problem have not been...

Subspace Clustering of Skill Mastery: Identifying Skills that Separate Students

2009

Rebecca Nugent Elizabeth Ayers Nema Dean

In educational research, a fundamental goal is identifying which skills students have mastered, which skills they have not, and which skills they are in the process of mastering. As the number of examinees, items, and skills increases, the estimation of even simple cognitive diagnosis models becomes difficult. We adopt a faster, simpler approach: cluster a capability matrix estimating each stud...

Revisiting Perceptually Optimized Color Mapping for High-Dimensional Data Analysis

2014

Sebastian Mittelstädt Jürgen Bernard Tobias Schreck Martin Steiger Jörn Kohlhammer Daniel A. Keim

Color is one of the most effective visual variables since it can be combined with other mappings and encode information without using any additional space on the display. An important example where expressing additional visual dimensions is direly needed is the analysis of high-dimensional data. The property of perceptual linearity is desirable in this application, because the user intuitively ...

Subspace clustering using affinity propagation

Journal: Pattern Recognition 2015

Guojun Gan Michael K. Ng

This paper proposes a subspace clustering algorithm by introducing attribute weights in the affinity propagation algorithm. A new step is introduced to the affinity propagation process to iteratively update the attribute weights based on the current partition of the data. The relative magnitude of the attribute weights can be used to identify the subspaces in which clusters are embedded. Experi...

Pleiades: Subspace Clustering and Evaluation

2008

Ira Assent Emmanuel Müller Ralph Krieger Timm Jansen Thomas Seidl

Subspace clustering mines the clusters present in locally relevant subsets of the attributes. In the literature, several approaches have been suggested along with different measures for quality assessment. Pleiades provides the means for easy comparison and evaluation of different subspace clustering approaches, along with several quality measures specific for subspace clustering as well as ext...

Interactive High-Dimensional Data Analysis Using The "Three Experts"

2016

Georg Albrecht Alex T. Pang

With the increasing availability of data from various domains such as health care, finance, social networks, etc., there is a need to provide analytic tools that are more accessible to lay people. In this paper, we present a software tool which can be used to aid inexperienced users in understanding high-dimensional data. To facilitate the understanding of such data, we place special emphasis on...
