Search results for: high dimensional clustering

Number of results: 2463052

2007
Xiang Zhang Wei Wang Jun Huan

High-throughput biotechnologies have enabled scientists to collect a large number of genetic and phenotypic attributes for a large collection of samples. Computational methods are needed to analyze these data for discovering genotype-phenotype associations and inferring possible phenotypes from genotypic attributes. In this paper, we study the problem of on-demand phenotype ranking. Given a qu...

2014
Boyue Wang Yongli Hu Junbin Gao Yanfeng Sun Baocai Yin

Low-rank representation (LRR) has recently attracted great interest due to its pleasing efficacy in exploring low-dimensional subspace structures embedded in data. One of its successful applications is subspace clustering which means data are clustered according to the subspaces they belong to. In this paper, at a higher level, we intend to cluster subspaces into classes of subspaces. This is n...

2011
Michael Biehl Barbara Hammer Erzsébet Merényi Alessandro Sperduti Thomas Villmann

This report documents the program and the outcomes of Dagstuhl Seminar 11341 “Learning in the context of very high dimensional data”. The aim of the seminar was to bring together researchers who develop, investigate, or apply machine learning methods for very high dimensional data to advance this important field of research. The focus was on broadly applicable methods and processing pipeline...

2007
Pando G. Georgiev Fabian J. Theis Anca L. Ralescu

We give general identifiability conditions on the source matrix in the Blind Signal Separation problem. They refine some previously known ones. We develop a subspace clustering algorithm, which is a generalization of the k-plane clustering algorithm, and is suitable for separation of sparse mixtures with bigger sparsity (i.e. when the number of the sensors is bigger at least by 2 than the number of...

2017
Amir Globerson Roi Livni Shai Shalev-Shwartz

The abundance of unlabeled data makes semi-supervised learning (SSL) an attractive approach for improving the accuracy of learning systems. However, we are still far from a complete theoretical understanding of the benefits of this learning scenario in terms of sample complexity. In particular, for many natural learning settings it can in fact be shown that SSL does not improve sample complexit...

2015
Marian B. Gorzalczany Jakub Piekoszewski Filip Rudzinski

The paper presents the application of our clustering technique based on generalized self-organizing neural networks with evolving treelike structures to complex cluster-analysis problems including, in particular, the sample-based and gene-based clusterings of the microarray Leukemia gene data set. Our approach works in a fully unsupervised way, i.e., without the necessity to predefine the number of...

2011
Xiaojun Chen Joshua Zhexue Huang

Data today are trending toward more observations and very high dimensions. Large high-dimensional data are usually sparse and contain many classes/clusters. For example, large text data in the vector space model often contain many classes of documents represented in thousands of terms. It has become the rule rather than the exception that clusters in high-dimensional data occur in subspaces of data, ...

2008
Robin Girard

High dimensional data analysis is known to be a challenging problem (see [10]). In this article, we give a theoretical analysis of high dimensional classification of Gaussian data which relies on a geometrical analysis of the error measure. It links a problem of classification with a problem of nonparametric regression. We give an algorithm designed for high dimensional data which appears st...

2013
Woncheol Jang Johan Lim Ji Meng Loh Nicole Lazar

Identifying homogeneous subgroups of variables can be challenging in high dimensional data analysis with highly correlated predictors. The generalized fused lasso has been proposed to simultaneously select correlated variables and identify them as predictive clusters. In this article, we study several properties of generalized fused lasso. First, we present a geometric interpretation of the gen...

2008
Zijiang Yang Guojun Gan

Classification refers to a set of methods that predict the class of an object from attributes or features describing the object. In this paper we present a fuzzy classification algorithm to predict bankruptcy. Our classification algorithm is modified from a subspace clustering algorithm called fuzzy subspace clustering (FSC). As our algorithm associates each feature of a class with a fuzzy memb...
