high dimensional clustering

Search results for: high dimensional clustering

Number of results: 2463052

Clustering High-Dimensional Data Using Evidence of Multimodality

2010

Peter Hall

We suggest a nonparametric approach to clustering very high-dimensional data, designed particularly for problems where the mixture nature of a population is expressed through multimodality of its density. In such cases a technique based implicitly on mode-testing can be particularly effective. In principle, several alternative approaches could be used to assess the extent of multimodality, but ...


A Prediction-Based Visual Approach for Cluster Exploration and Cluster Validation by HOV3

2007

Ke-Bing Zhang, Mehmet A. Orgun, Kang Zhang

Predictive knowledge discovery is an important knowledge acquisition method. It is also used in the clustering process of data mining. Visualization is very helpful for high dimensional data analysis, but it is not precise, which limits its usability in quantitative cluster analysis. In this paper, we adopt a visual technique called HOV3 to explore and verify clustering results with quantified measu...


CoDA: Interactive Cluster Based Concept Discovery

Journal: PVLDB, 2010

Stephan Günnemann, Ines Färber, Hardy Kremer, Thomas Seidl

Large data resources are ubiquitous in science and business. For these domains, an intuitive view on the data is essential to fully exploit the hidden knowledge. Often, these data can be semantically structured by concepts. Since the determination of concepts requires a thorough analysis of the data, data mining methods have to be applied. In the field of subspace clustering, some techniques ha...


Clustering Student Learning Activity Data

2010

Haiyun Bian

We show a variety of ways to cluster student activity datasets using different clustering and subspace clustering algorithms. Our results suggest that each algorithm has its own strengths and weaknesses, and can be used to find clusters with different properties. Many education datasets are by nature high dimensional. Finding coherent and compact clusters becomes difficult ...


Robust Localized Multi-view Subspace Clustering

Journal: CoRR, 2017

Yanbo Fan, Jian Liang, Ran He, Bao-Gang Hu, Siwei Lyu

In multi-view clustering, different views may have different confidence levels when learning a consensus representation. Existing methods usually address this by assigning distinctive weights to different views. However, due to the noisy nature of real-world applications, the confidence levels of samples in the same view may also vary. Thus considering a unified weight for a view may lead to suboptim...


Subspace Clustering using Ensembles of $K$-Subspaces

Journal: CoRR, 2017

John Lipor, David Hong, Dejiao Zhang, Laura Balzano

We present a novel approach to the subspace clustering problem that leverages ensembles of the K-subspaces (KSS) algorithm via the evidence accumulation clustering framework. Our algorithm forms a co-association matrix whose (i, j)th entry is the number of times points i and j are clustered together by several runs of KSS with random initializations. We analyze the entries of this co-associatio...
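The co-association matrix described in this abstract can be illustrated with a minimal sketch. The `k_subspaces` function below is a simplified, textbook-style K-subspaces loop written for illustration only; it is an assumption and not the authors' implementation, and the names `k_subspaces`, `co_association`, `dim`, and `n_runs` are hypothetical. The sketch only counts how often each pair of points lands in the same cluster across randomly initialized runs; whatever the paper does with the matrix afterwards is not shown.

```python
import numpy as np

def k_subspaces(X, n_clusters, dim, n_iter=20, rng=None):
    """One run of a basic K-subspaces loop (illustrative assumption).

    X has shape (n_points, n_features); returns one integer label per point.
    """
    rng = np.random.default_rng(rng)
    n, d = X.shape
    # Random initialization: one orthonormal basis (d x dim) per cluster.
    bases = [np.linalg.qr(rng.standard_normal((d, dim)))[0]
             for _ in range(n_clusters)]
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        # Assignment: largest projection energy means smallest residual.
        energy = np.stack(
            [np.sum((X @ U) ** 2, axis=1) for U in bases], axis=1)
        labels = np.argmax(energy, axis=1)
        # Update: refit each basis to its points via the top singular vectors.
        for k in range(n_clusters):
            pts = X[labels == k]
            if len(pts) >= dim:
                _, _, vt = np.linalg.svd(pts, full_matrices=False)
                bases[k] = vt[:dim].T
    return labels

def co_association(X, n_clusters, dim, n_runs=30, seed=0):
    """Entry (i, j) counts the runs in which points i and j share a cluster."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    counts = np.zeros((n, n), dtype=int)
    for _ in range(n_runs):
        labels = k_subspaces(X, n_clusters, dim, rng=rng)
        counts += labels[:, None] == labels[None, :]
    return counts
```

By construction the matrix is symmetric and every diagonal entry equals `n_runs`, since a point is always clustered with itself.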


Rate-optimal Posterior Contraction for Sparse PCA

2013

Chao Gao, Harrison H. Zhou

Principal component analysis (PCA) is possibly one of the most widely used statistical tools to recover a low rank structure of the data. In high-dimensional settings, the leading eigenvector of the sample covariance can be nearly orthogonal to the true eigenvector. A sparse structure is then commonly assumed along with a low rank structure. Recently, minimax estimation rates of sparse PCA ...
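The claim that the sample leading eigenvector can be nearly orthogonal to the true one can be checked numerically with a small sketch. The setup below is an assumption made for illustration, not taken from the abstract: data are drawn from a single-spike covariance model (identity plus `spike` times a rank-one term), the mean is treated as known to be zero, and the function name `leading_eigvec_alignment` is hypothetical. It returns the absolute cosine between the true and estimated leading eigenvectors.

```python
import numpy as np

def leading_eigvec_alignment(n_samples, n_features, spike, seed=0):
    """|cosine| between the true and the sample leading eigenvector.

    Assumed model: covariance = I + spike * v v^T with v the first
    coordinate axis.
    """
    rng = np.random.default_rng(seed)
    v_true = np.zeros(n_features)
    v_true[0] = 1.0
    # Isotropic noise plus an independent component along v_true.
    X = rng.standard_normal((n_samples, n_features))
    X += np.sqrt(spike) * rng.standard_normal((n_samples, 1)) * v_true
    # The leading right singular vector of X is the leading eigenvector
    # of the (uncentered) sample covariance X^T X / n_samples.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return abs(vt[0] @ v_true)

# Many samples, few features: the estimate should align well with v_true.
align_low_dim = leading_eigvec_alignment(2000, 20, spike=1.0)
# Few samples, many features: the estimate is expected to align poorly.
align_high_dim = leading_eigvec_alignment(20, 500, spike=1.0)
```

Comparing `align_low_dim` with `align_high_dim` shows the effect: with the same spike strength, the alignment degrades sharply once the number of features far exceeds the number of samples.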


Constrained Subspace Clustering for Time Series Gene Expression Data

2010

Jibin Qu, Michael Ng, Luonan Chen

For time series gene expression data, it is an important problem to find subgroups of genes with similar expression patterns in a consecutive time window. In this paper, we extend a fuzzy c-means clustering algorithm to construct two models that detect, respectively, constant value biclusters and similarity-based biclusters whose gene expression profiles are similar within consecut...


CUR Decompositions, Similarity Matrices, and Subspace Clustering

Journal: CoRR, 2017

Akram Aldroubi, Keaton Hamm, Ahmet Bugra Koku, Ali Sekmen

A general framework for solving the subspace clustering problem using the CUR decomposition is presented. The CUR decomposition provides a natural way to construct similarity matrices for data that come from a union of unknown subspaces $U = \bigcup_{i=1}^{M} S_i$. The similarity matrices thus constructed give the exact clustering in the noise-free case. A simple adaptation of the technique also allows clus...
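As background for this abstract, the CUR decomposition itself can be sketched in a few lines. The example relies on the standard fact that a matrix A is recovered exactly as C U⁺ R whenever the row-and-column intersection block has the same rank as A. The function name `cur_reconstruct`, the choice of sampled rows and columns, and the `rcond` cutoff are assumptions for illustration, and the block is called `U_mid` in the code to avoid confusion with the union of subspaces in the abstract. The paper's construction of similarity matrices from this factorization is not reproduced here.

```python
import numpy as np

def cur_reconstruct(A, rows, cols):
    """Rebuild A from selected columns C, selected rows R, and their
    intersection block U_mid, as C @ pinv(U_mid) @ R."""
    C = A[:, cols]                 # chosen columns of A
    R = A[rows, :]                 # chosen rows of A
    U_mid = A[np.ix_(rows, cols)]  # entries where chosen rows and columns meet
    # An explicit cutoff keeps round-off-level singular values from
    # being inverted.
    return C @ np.linalg.pinv(U_mid, rcond=1e-10) @ R

rng = np.random.default_rng(0)
rank = 3
# A random 8 x 10 matrix of rank 3 (noise-free low-rank data).
A = rng.standard_normal((8, rank)) @ rng.standard_normal((rank, 10))

# Oversample: pick 2 * rank rows and columns so the block has rank 3.
rows = rng.choice(A.shape[0], size=2 * rank, replace=False)
cols = rng.choice(A.shape[1], size=2 * rank, replace=False)

A_rebuilt = cur_reconstruct(A, rows, cols)
max_error = np.max(np.abs(A - A_rebuilt))
```

In this noise-free setting `max_error` should be at the level of floating-point round-off.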


Finding Hierarchies of Subspace Clusters

2006

Elke Achtert, Christian Böhm, Hans-Peter Kriegel, Peer Kröger, Ina Müller-Gorman, Arthur Zimek

Many clustering algorithms are not applicable to high-dimensional feature spaces, because the clusters often exist only in specific subspaces of the original feature space. Those clusters are also called subspace clusters. In this paper, we propose the algorithm HiSC (Hierarchical Subspace Clustering) that can detect hierarchies of nested subspace clusters, i.e. the relationships of lower-dimens...

