high dimensional clustering

Search results for: high dimensional clustering

Number of results: 2463052

On Demand Phenotype Ranking through Subspace Clustering

2007

Xiang Zhang, Wei Wang, Jun Huan

High-throughput biotechnologies have enabled scientists to collect a large number of genetic and phenotypic attributes for a large collection of samples. Computational methods are needed to analyze these data, both to discover genotype-phenotype associations and to infer possible phenotypes from genotypic attributes. In this paper, we study the problem of on-demand phenotype ranking. Given a qu...

Low Rank Representation on Grassmann Manifolds

2014

Boyue Wang, Yongli Hu, Junbin Gao, Yanfeng Sun, Baocai Yin

Low-rank representation (LRR) has recently attracted great interest due to its pleasing efficacy in exploring low-dimensional subspace structures embedded in data. One of its successful applications is subspace clustering, in which data are clustered according to the subspaces they belong to. In this paper, at a higher level, we intend to cluster subspaces into classes of subspaces. This is n...

Learning in the context of very high dimensional data

2011

Michael Biehl, Barbara Hammer, Erzsébet Merényi, Alessandro Sperduti, Thomas Villmann

This report documents the program and the outcomes of Dagstuhl Seminar 11341 “Learning in the context of very high dimensional data”. The aim of the seminar was to bring together researchers who develop, investigate, or apply machine learning methods for very high dimensional data to advance this important field of research. The focus was on broadly applicable methods and processing pipeline...

Identifiability Conditions and Subspace Clustering in Sparse BSS

2007

Pando G. Georgiev, Fabian J. Theis, Anca L. Ralescu

We give general identifiability conditions on the source matrix in the Blind Signal Separation problem. They refine some previously known conditions. We develop a subspace clustering algorithm, which is a generalization of the k-plane clustering algorithm and is suitable for separation of sparse mixtures with bigger sparsity (i.e. when the number of the sensors is bigger at least by 2 than the number of...

Effective Semisupervised Learning on Manifolds

2017

Amir Globerson, Roi Livni, Shai Shalev-Shwartz

The abundance of unlabeled data makes semi-supervised learning (SSL) an attractive approach for improving the accuracy of learning systems. However, we are still far from a complete theoretical understanding of the benefits of this learning scenario in terms of sample complexity. In particular, for many natural learning settings it can in fact be shown that SSL does not improve sample complexit...

Microarray Leukemia Gene Data Clustering by Means of Generalized Self-organizing Neural Networks with Evolving Tree-Like Structures

2015

Marian B. Gorzalczany, Jakub Piekoszewski, Filip Rudzinski

The paper presents the application of our clustering technique based on generalized self-organizing neural networks with evolving tree-like structures to complex cluster-analysis problems including, in particular, the sample-based and gene-based clusterings of the microarray Leukemia gene data set. Our approach works in a fully unsupervised way, i.e., without the necessity to predefine the number of...

Multi-view Subspace Clustering for High-dimensional Data

2011

Xiaojun Chen, Joshua Zhexue Huang

Data today tend toward more observations and very high dimensions. Large high-dimensional data are usually sparse and contain many classes/clusters. For example, large text data in the vector space model often contain many classes of documents represented in thousands of terms. It has become the rule rather than the exception that clusters in high-dimensional data occur in subspaces of data, ...

High dimensional Gaussian classification

2008

Robin Girard

High dimensional data analysis is known to be a challenging problem (see [10]). In this article, we give a theoretical analysis of high dimensional classification of Gaussian data which relies on a geometrical analysis of the error measure. It links a problem of classification with a problem of nonparametric regression. We give an algorithm designed for high dimensional data which appears st...

Some Properties of Generalized Fused Lasso and Its Applications to High Dimensional Data

2013

Woncheol Jang, Johan Lim, Ji Meng Loh, Nicole Lazar

Identifying homogeneous subgroups of variables can be challenging in high dimensional data analysis with highly correlated predictors. The generalized fused lasso has been proposed to simultaneously select correlated variables and identify them as predictive clusters. In this article, we study several properties of generalized fused lasso. First, we present a geometric interpretation of the gen...

Application of Fuzzy Classification in Bankruptcy Prediction

2008

Zijiang Yang, Guojun Gan

Classification refers to a set of methods that predict the class of an object from attributes or features describing the object. In this paper we present a fuzzy classification algorithm to predict bankruptcy. Our classification algorithm is modified from a subspace clustering algorithm called fuzzy subspace clustering (FSC). As our algorithm associates each feature of a class with a fuzzy memb...
