A Probabilistic Spell for the Curse of Dimensionality
Authors
Abstract
Range searches in metric spaces can be very difficult if the space is "high dimensional", i.e. when the histogram of distances has a large mean and/or a small variance. This so-called "curse of dimensionality", well known in vector spaces, is also observed in metric spaces. There are at least two reasons behind the curse of dimensionality: a large search radius and/or a high intrinsic dimension of the metric space. We present a general probabilistic framework based on stretching the triangle inequality, whose direct effect is a reduction of the effective search radius. The technique gets more effective as the dimension grows, and the basic principle can be applied to any search algorithm. In this paper we apply it to a particular class of indexing algorithms. We present an analysis which helps understand the process, as well as empirical evidence showing dramatic improvements in the search time at the cost of a very small error probability.
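To make the stretching idea concrete, here is a minimal sketch of a plain pivot-based range search with a stretch factor beta applied to the triangle-inequality lower bound. This is only an illustration under our own assumptions (the function `probabilistic_range_search`, the pivot layout, and the toy data are ours, not the paper's code): with beta = 1 the filter is the usual exact exclusion rule, while beta > 1 is equivalent to running the exact filter with the reduced effective radius r / beta, so more elements are pruned before their distance to the query is computed, at the cost of a small probability of missing true answers.

```python
import numpy as np

def probabilistic_range_search(db, pivots, piv_dists, q, r, dist, beta=1.0):
    """Pivot filtering with a stretched triangle inequality.

    db        -- list of database objects
    pivots    -- the chosen pivot objects
    piv_dists -- precomputed matrix, piv_dists[i][j] = dist(db[i], pivots[j])
    q, r      -- query object and search radius
    dist      -- the metric
    beta      -- stretching factor; beta = 1.0 gives the exact algorithm,
                 beta > 1.0 prunes more but may miss some true answers
    """
    # Distances from the query to every pivot are always computed.
    q_piv = np.array([dist(q, p) for p in pivots])

    results = []
    for i, u in enumerate(db):
        # Classical exclusion: discard u if |d(q,p) - d(u,p)| > r for some pivot p.
        # Stretched exclusion: discard u if beta * |d(q,p) - d(u,p)| > r,
        # i.e. search with the reduced effective radius r / beta.
        lower_bound = np.max(np.abs(q_piv - piv_dists[i]))
        if beta * lower_bound > r:
            continue                      # pruned without evaluating dist(q, u)
        if dist(q, u) <= r:               # verify the surviving candidates
            results.append(u)
    return results

# Toy usage: random 20-dimensional vectors under the Euclidean distance.
rng = np.random.default_rng(0)
dist = lambda a, b: float(np.linalg.norm(a - b))
db = [rng.random(20) for _ in range(2000)]
pivots = db[:16]
piv_dists = np.array([[dist(u, p) for p in pivots] for u in db])
q, r = rng.random(20), 1.7
exact = probabilistic_range_search(db, pivots, piv_dists, q, r, dist, beta=1.0)
approx = probabilistic_range_search(db, pivots, piv_dists, q, r, dist, beta=2.0)
print(len(exact), len(approx))   # the approximate answer may miss a few elements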
Similar references
Probabilistic Neural Network Models for Sequential Data
It has already been shown how Artificial Neural Networks (ANNs) can be incorporated into probabilistic models. In this paper we review some of the approaches which have been proposed to incorporate them into probabilistic models of sequential data, such as Hidden Markov Models (HMMs). We also discuss new developments and new ideas in this area, in particular how ANNs can be used to model high-d...
Supervised Probabilistic Principal Component Analysis Mixture Model in a Lossless Dimensionality Reduction Framework for Face Recognition
In this paper, we first propose the supervised version of the probabilistic principal component analysis mixture model. Then, we consider a predictive learning model with projection penalties as an approach to dimensionality reduction without loss of information for face recognition. In the proposed method, first a locally linear underlying manifold of the data samples is obtained using the supervised...
Critical Evaluation of Linear Dimensionality Reduction Techniques for Cardiac Arrhythmia Classification
Embedding the original high dimensional data in a low dimensional space helps to overcome the curse of dimensionality and removes noise. The aim of this work is to evaluate the performance of three different linear dimensionality reduction (DR) techniques, namely principal component analysis (PCA), multidimensional scaling (MDS) and linear discriminant analysis (LDA), on classificatio...
Quick Training of Probabilistic Neural Nets by Importance Sampling
Our previous work on statistical language modeling introduced the use of probabilistic feedforward neural networks to help deal with the curse of dimensionality. Training this model by maximum likelihood, however, requires performing, for each example, as many network passes as there are words in the vocabulary. Inspired by the contrastive divergence model, we propose and evaluate sampling-based...
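A rough sketch of the sampling idea follows, written for a toy log-bilinear scorer rather than the paper's feedforward architecture; the uniform proposal, the sample size k, and every name below are assumptions made only for illustration. The exact log-softmax gradient touches all V vocabulary scores, whereas a self-normalized importance-sampling estimate touches only the k sampled ones:

```python
import numpy as np

rng = np.random.default_rng(0)
V, D = 10_000, 50                            # vocabulary size, context dimension
W = 0.01 * rng.standard_normal((D, V))       # output weights: score(v) = h . W[:, v]

def exact_grad(h, target):
    """Exact gradient of log P(target | h): needs scores for all V words."""
    p = np.exp(h @ W)
    p /= p.sum()
    g = -np.outer(h, p)                      # "negative" part: E_P[d score / d W]
    g[:, target] += h                        # "positive" part for the observed word
    return g

def sampled_grad(h, target, proposal, k=64):
    """Importance-sampled estimate: needs scores for only k sampled words."""
    s = rng.choice(V, size=k, p=proposal)            # draw words from the proposal
    w = np.exp(h @ W[:, s]) / proposal[s]            # importance weights
    w /= w.sum()                                     # self-normalization
    g = np.zeros_like(W)
    np.add.at(g.T, s, -np.outer(w, h))               # accumulate (handles repeated samples)
    g[:, target] += h
    return g

h, target = rng.standard_normal(D), 42
proposal = np.full(V, 1.0 / V)               # e.g. a uniform (or unigram) proposal
g_exact = exact_grad(h, target)
g_one = sampled_grad(h, target, proposal)
g_avg = sum(sampled_grad(h, target, proposal) for _ in range(200)) / 200
err = lambda g: np.linalg.norm(g - g_exact) / np.linalg.norm(g_exact)
print(err(g_one), err(g_avg))                # the error shrinks as estimates are averaged
```

The last line only illustrates that the estimator tracks the exact gradient while each individual update costs roughly k/V of a full pass over the vocabulary.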
Mixtures of Gaussian Distributions under Linear Dimensionality Reduction
High dimensional spaces pose a serious challenge to the learning process. It is the combination of a limited number of samples and high dimensionality that places many problems under the "curse of dimensionality", which severely restricts the practical application of density estimation. Many techniques have been proposed in the past to discover embedded, locally-linear manifolds of lower dimensional...
Journal:
Volume, issue:
Pages: -
Publication year: 2001