Search results for: high dimensional clustering
Number of results: 2463052
High-throughput biotechnologies have enabled scientists to collect a large number of genetic and phenotypic attributes for a large collection of samples. Computational methods are needed to analyze these data for discovering genotype-phenotype associations and inferring possible phenotypes from genotypic attributes. In this paper, we study the problem of on-demand phenotype ranking. Given a qu...
Low-rank representation (LRR) has recently attracted great interest due to its pleasing efficacy in exploring low-dimensional subspace structures embedded in data. One of its successful applications is subspace clustering which means data are clustered according to the subspaces they belong to. In this paper, at a higher level, we intend to cluster subspaces into classes of subspaces. This is n...
This report documents the program and the outcomes of Dagstuhl Seminar 11341 “Learning in the context of very high dimensional data”. The aim of the seminar was to bring together researchers who develop, investigate, or apply machine learning methods for very high dimensional data to advance this important field of research. The focus was on broadly applicable methods and processing pipeline...
We give general identifiability conditions on the source matrix in the Blind Signal Separation problem. They refine some previously known ones. We develop a subspace clustering algorithm, which is a generalization of the k-plane clustering algorithm, and is suitable for separation of sparse mixtures with bigger sparsity (i.e. when the number of sensors is at least 2 greater than the number of...
The abundance of unlabeled data makes semi-supervised learning (SSL) an attractive approach for improving the accuracy of learning systems. However, we are still far from a complete theoretical understanding of the benefits of this learning scenario in terms of sample complexity. In particular, for many natural learning settings it can in fact be shown that SSL does not improve sample complexit...
The paper presents the application of our clustering technique based on generalized self-organizing neural networks with evolving treelike structures to complex cluster-analysis problems including, in particular, the sample-based and gene-based clusterings of the microarray Leukemia gene data set. Our approach works in a fully unsupervised way, i.e., without the necessity to predefine the number of...
Data today tend toward more observations and very high dimensionality. Large high-dimensional data are usually sparse and contain many classes/clusters. For example, large text data in the vector space model often contain many classes of documents represented in thousands of terms. It has become the rule rather than the exception that clusters in high-dimensional data occur in subspaces of data, ...
High dimensional data analysis is known to be a challenging problem (see [10]). In this article, we give a theoretical analysis of high dimensional classification of Gaussian data which relies on a geometrical analysis of the error measure. It links a problem of classification with a problem of nonparametric regression. We give an algorithm designed for high dimensional data which appears st...
Identifying homogeneous subgroups of variables can be challenging in high dimensional data analysis with highly correlated predictors. The generalized fused lasso has been proposed to simultaneously select correlated variables and identify them as predictive clusters. In this article, we study several properties of the generalized fused lasso. First, we present a geometric interpretation of the gen...
Classification refers to a set of methods that predict the class of an object from attributes or features describing the object. In this paper we present a fuzzy classification algorithm to predict bankruptcy. Our classification algorithm is modified from a subspace clustering algorithm called fuzzy subspace clustering (FSC). As our algorithm associates each feature of a class with a fuzzy memb...