Random subspace and random projection nearest neighbor ensembles for high dimensional data
Abstract
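The random subspace and the random projection methods are investigated and compared as techniques for forming ensembles of nearest neighbor classifiers in high-dimensional feature spaces. The two methods have been empirically evaluated on three types of high-dimensional datasets: microarrays, chemoinformatics, and images. Experimental results on 34 datasets show that both the random subspace and the random projection method lead to improvements in predictive performance compared to using the standard nearest neighbor classifier, while the best method to use depends on the type of data considered; for the microarray and chemoinformatics datasets, random projection outperforms the random subspace method, while the opposite holds for the image datasets. An analysis using data complexity measures, such as the attribute to instance ratio and Fisher’s discriminant ratio, provides some more detailed indications of what relative performance can be expected for specific datasets. The results also indicate that the resulting ensembles may be competitive with state-of-the-art ensemble classifiers; the nearest neighbor ensembles using random projection perform on par with random forests for the microarray and chemoinformatics datasets.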
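The two ensemble constructions compared in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the use of 1-nearest-neighbor members, a Gaussian projection matrix, majority voting, and all names and parameter values (`n_members`, `k`, `seed`, the toy data) are assumptions chosen for illustration only.

```python
import math
import random
from collections import Counter


def nearest_label(train_X, train_y, x):
    """1-nearest-neighbor prediction using squared Euclidean distance."""
    best = min(range(len(train_X)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return train_y[best]


def subspace_map(n_features, k, rng):
    """Random subspace: keep k randomly chosen original features."""
    idx = rng.sample(range(n_features), k)
    return lambda x: [x[j] for j in idx]


def projection_map(n_features, k, rng):
    """Random projection: multiply by a k x n_features Gaussian matrix."""
    R = [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(n_features)]
         for _ in range(k)]
    return lambda x: [sum(r * v for r, v in zip(row, x)) for row in R]


def ensemble_predict(train_X, train_y, x, make_map, n_members=11, k=5, seed=0):
    """Majority vote over 1-NN members, each in its own random feature space."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_members):
        f = make_map(len(x), k, rng)  # a fresh random map per ensemble member
        reduced = [f(row) for row in train_X]
        votes.append(nearest_label(reduced, train_y, f(x)))
    return Counter(votes).most_common(1)[0][0]


# Toy data (hypothetical): two well-separated classes in 20 dimensions.
data_rng = random.Random(1)
X = [[data_rng.gauss(c, 0.1) for _ in range(20)]
     for c in (0.0, 1.0) for _ in range(10)]
y = [0] * 10 + [1] * 10
query = [1.0] * 20

pred_rs = ensemble_predict(X, y, query, subspace_map)
pred_rp = ensemble_predict(X, y, query, projection_map)
```

The only difference between the two ensembles in this sketch is the feature map: the random subspace member selects original attributes unchanged, whereas the random projection member forms new attributes as random linear combinations of all original ones.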
Similar resources
Random Projection-Based Anderson-Darling Test for Random Fields
In this paper, we present the Anderson-Darling (AD) and Kolmogorov-Smirnov (KS) goodness of fit statistics for stationary and non-stationary random fields. Namely, we adopt an easy-to-apply method based on a random projection of a Hilbert-valued random field onto the real line R, and then, applying the well-known AD and KS goodness of fit tests. We conclude this paper by studying the behavior o...
Weighted random subspace method for high dimensional data classification.
High dimensional data, especially those emerging from genomics and proteomics studies, pose significant challenges to traditional classification algorithms because the performance of these algorithms may substantially deteriorate due to high dimensionality and existence of many noisy features in these data. To address these problems, pre-classification feature selection and aggregating algorith...
Image recognition via two-dimensional random projection and nearest constrained subspace
We consider the problem of image recognition via two-dimensional random projection and nearest constrained subspace. First, image features are extracted by a two-dimensional random projection. The two-dimensional random projection for feature extraction is an extension of the 1D compressive sampling technique to 2D and is computationally more efficient than its 1D counterpart and 2D reconstruct...
Classification by ensembles from random partitions of high-dimensional data
A robust classification procedure is developed based on ensembles of classifiers, with each classifier constructed from a different set of predictors determined by a random partition of the entire set of predictors. The proposed methods combine the results of multiple classifiers to achieve a substantially improved prediction compared to the optimal single classifier. This approach is designed ...
Fast Parallel Estimation of High Dimensional Information Theoretical Quantities with Low Dimensional Random Projection Ensembles
Goal: estimation of high dimensional information theoretical quantities (entropy, mutual information, divergence). • Problem: computation/estimation is quite slow. • Consistent estimation is possible by nearest neighbor (NN) methods [1] → pairwise distances of sample points: – expensive in high dimensions [2], – approximate isometric embedding into low dimension is possible (Johnson-Lindenstrau...
Journal
Journal title: Expert Systems With Applications
Year: 2022
ISSN: ['1873-6793', '0957-4174']
DOI: https://doi.org/10.1016/j.eswa.2021.116078