Search results for: high dimensional data

Number of results: 4272118

2016
Daniel L. Pimentel-Alarcón, Robert D. Nowak

Subspace clustering with missing data (SCMD) is a useful tool for analyzing incomplete datasets. Let d be the ambient dimension, and r the dimension of the subspaces. Existing theory shows that Nk = O(rd) columns per subspace are necessary for SCMD, and Nk = O(min{d , d}) are sufficient. We close this gap, showing that Nk = O(rd) is also sufficient. To do this we derive deterministic sampling co...

2004
Carlotta Domeniconi, Dimitris Papadopoulos, Dimitrios Gunopulos, Sheng Ma

Clustering suffers from the curse of dimensionality, and similarity functions that use all input features with equal relevance may not be effective. We introduce an algorithm that discovers clusters in subspaces spanned by different combinations of dimensions via local weightings of features. This approach avoids the risk of loss of information encountered in global dimensionality reduction tec...

2013
Yariv Aizenbud, Amit Bermanis, Amir Averbuch

Dimensionality reduction methods are very common in the field of high dimensional data analysis, where the classical analysis methods are inadequate. Typically, algorithms for dimensionality reduction are computationally expensive. Therefore, applying them to data warehouses is impractical. This becomes even more apparent when the data accumulates continuously. In this paper, an out-of-sam...

Journal: CoRR 2015
Yifan Fu, Junbin Gao, Xia Hong, David Tien

In this paper, we present a novel low rank representation (LRR) algorithm for data lying on the manifold of square root densities. Unlike traditional LRR methods which rely on the assumption that the data points are vectors in the Euclidean space, our new algorithm is designed to incorporate the intrinsic geometric structure and geodesic distance of the manifold. Experiments on several computer...

2003
P. Deepa Shenoy, K. G. Srinivasa, M. P. Mithun, K. R. Venugopal, Lalit M. Patnaik

Emerging high-dimensional data mining applications need to find interesting clusters embedded in arbitrarily aligned subspaces of lower dimensionality. It is difficult to cluster high-dimensional data objects when they are sparse and skewed. Updates are quite common in dynamic databases, and they are usually processed in batch mode. In very large dynamic databases, it is necessary to perform ...

Journal: CoRR 2017
Stephen Tierney, Yi Guo, Junbin Gao

Sparse Subspace Clustering (SSC) has been used extensively for subspace identification tasks due to its theoretical guarantees and relative ease of implementation. However, SSC has quadratic computation and memory requirements with respect to the number of input data points. This burden has prohibited SSC's use for all but the smallest datasets. To overcome this we propose a n...

2015
Congyuan Yang, Daniel P. Robinson, René Vidal

We consider the problem of clustering incomplete data drawn from a union of subspaces. Classical subspace clustering methods are not applicable to this problem because the data are incomplete, while classical low-rank matrix completion methods may not be applicable because data in multiple subspaces may not be low rank. This paper proposes and evaluates two new approaches for subspace clusterin...

2008
Elke Achtert, Hans-Peter Kriegel, Arthur Zimek

In order to establish consolidated standards in novel data mining areas, newly proposed algorithms need to be evaluated thoroughly. Many publications compare a new proposition – if at all – with one or two competitors or even with a so-called “naïve” ad hoc solution. For the prolific field of subspace clustering, we propose a software framework implementing many prominent algorithms and, thus, ...

Journal: IEEE Trans. Geoscience and Remote Sensing 1993
Chulhee Lee, David A. Landgrebe

In this paper, through a series of specific examples, we illustrate some characteristics encountered in analyzing high dimensional multispectral data. The increased importance of the second order statistics in analyzing high dimensional data is illustrated, as is the shortcoming of classifiers such as the minimum distance classifier which rely on first order variations alone. We also illustrate...
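
The point about first-order versus second-order statistics can be illustrated with a small sketch. It is not taken from the paper: the two classes, their means and variances, and the function names below are all assumptions chosen for illustration. Both classes share the same mean, so a minimum distance classifier sees a tie, while a Gaussian classifier that also uses the (diagonal) covariance separates them.

```python
import math

def min_distance_scores(x, means):
    """Euclidean distance from x to each class mean (first-order statistics only)."""
    return {label: math.dist(x, mu) for label, mu in means.items()}

def gaussian_log_likelihood(x, mean, variances):
    """Log-density of a Gaussian with diagonal covariance (uses second-order statistics)."""
    return sum(
        -0.5 * math.log(2 * math.pi * var) - 0.5 * (xi - mi) ** 2 / var
        for xi, mi, var in zip(x, mean, variances)
    )

def gaussian_class(x, means, variances):
    """Pick the class with the highest Gaussian log-likelihood."""
    scores = {
        label: gaussian_log_likelihood(x, means[label], variances[label])
        for label in means
    }
    return max(scores, key=scores.get)

# Hypothetical classes: identical means, very different spreads.
means = {"A": (0.0, 0.0), "B": (0.0, 0.0)}
variances = {"A": (1.0, 1.0), "B": (100.0, 100.0)}

x = (15.0, -12.0)
dists = min_distance_scores(x, means)      # equal distances: the means cannot decide
label = gaussian_class(x, means, variances)  # the variances can
```

With these assumed parameters the distances to both means are identical, so the mean-based rule has no basis for a decision, whereas the covariance-aware rule assigns the far-out point to the wide class.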

2000
Mario A. Lopez, Swanwa Liao

We present a novel approach to report approximate as well as exact k-closest pairs for sets of high dimensional points, under the L_t-metric, t = 1, ..., ∞. The proposed algorithms are efficient and simple to implement. They all use multiple shifted copies of the data points sorted according to their position along a space-filling curve, such as the Peano curve, in a way that allows us to make p...
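
The general idea of sorting shifted copies along a space-filling curve can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it swaps in a Z-order (Morton) curve for the Peano curve, finds only a single approximate closest pair, and the shift values, window size, and coordinate range are assumptions made for the example.

```python
import math

def morton_key(point, bits):
    """Interleave coordinate bits to get the point's position on a Z-order curve."""
    key = 0
    for bit in range(bits - 1, -1, -1):
        for coord in point:
            key = (key << 1) | ((coord >> bit) & 1)
    return key

def lt_distance(p, q, t):
    """L_t distance between two points; t = math.inf gives the max-norm."""
    diffs = [abs(a - b) for a, b in zip(p, q)]
    if t == math.inf:
        return max(diffs)
    return sum(d ** t for d in diffs) ** (1.0 / t)

def approx_closest_pair(points, shifts=(0, 341, 682), window=2, t=math.inf):
    """Approximate closest pair for integer points with coordinates in [0, 1024).

    Each (assumed) shift yields a different curve ordering; only points that
    are within `window` positions of each other in some ordering are compared.
    """
    bits = 11  # shifted coordinates stay below 1024 + 682 < 2**11
    best = (math.inf, None, None)
    for shift in shifts:
        order = sorted(
            range(len(points)),
            key=lambda i: morton_key([c + shift for c in points[i]], bits),
        )
        for pos, i in enumerate(order):
            for j in order[pos + 1 : pos + 1 + window]:
                d = lt_distance(points[i], points[j], t)
                if d < best[0]:
                    best = (d, min(i, j), max(i, j))
    return best

points = [(0, 0), (100, 100), (101, 100), (500, 900)]
result = approx_closest_pair(points)  # (distance, index_i, index_j)
```

Several shifts are used because two nearby points can land far apart on a single curve ordering when they straddle a cell boundary; a different shift moves the boundaries and can bring them together.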
