A Detailed Study of the Distributed Rough Set Based Locality Sensitive Hashing Feature Selection Technique
Abstract
In the context of big data, granular computing has recently been implemented by some mathematical tools, especially Rough Set Theory (RST). As a key topic of rough set theory, feature selection has been investigated to adapt the related concepts of RST to deal with large amounts of data, leading to the development of a distributed RST version. However, despite its scalability, the distributed RST version faces a challenge tied to the partitioning of the feature search space in a distributed environment while guaranteeing data dependency. Therefore, in this manuscript, we propose a new distributed RST version based on Locality Sensitive Hashing (LSH), named LSH-dRST, for big data feature selection. LSH-dRST uses LSH to match similar features into the same bucket and maps the generated buckets into partitions to enable the splitting of the universe in a more efficient way. More precisely, in this paper, we perform a detailed analysis of the performance of LSH-dRST by comparing it to the standard distributed RST version, which is based on a random partitioning of the universe. We demonstrate that our LSH-dRST is scalable when dealing with large amounts of data. We also demonstrate that LSH-dRST ensures the partitioning of the high dimensional feature search space in a more reliable way; hence better preserving data dependency in the distributed environment and ensuring a lower computational cost.
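The bucket-to-partition idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which LSH family LSH-dRST uses, so the sketch assumes sign random projection (a cosine-similarity LSH) over numeric feature columns, and it assumes whole buckets are assigned to partitions round-robin. The names `signature`, `lsh_partition`, `n_bits`, and `n_partitions` are hypothetical.

```python
import random
from collections import defaultdict


def signature(column, hyperplanes):
    """Sign-random-projection hash: one bit per random hyperplane."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, column)) >= 0 else 0
        for plane in hyperplanes
    )


def lsh_partition(columns, n_bits, n_partitions, seed=0):
    """Group feature columns into LSH buckets, then map buckets to partitions.

    columns: dict of feature name -> list of numeric values (one per object).
    Returns a list of n_partitions lists of feature names.
    """
    rng = random.Random(seed)
    n_objects = len(next(iter(columns.values())))
    # Assumption: Gaussian hyperplanes, i.e. a cosine-similarity LSH family.
    hyperplanes = [
        [rng.gauss(0.0, 1.0) for _ in range(n_objects)] for _ in range(n_bits)
    ]

    # Features whose columns point in similar directions tend to collide.
    buckets = defaultdict(list)
    for name, values in columns.items():
        buckets[signature(values, hyperplanes)].append(name)

    # Assumption: whole buckets are assigned round-robin, so features that
    # share a bucket always end up in the same partition.
    partitions = [[] for _ in range(n_partitions)]
    for i, key in enumerate(sorted(buckets)):
        partitions[i % n_partitions].extend(buckets[key])
    return partitions


# Toy data: f2 is a positive multiple of f1, f3 points the opposite way.
columns = {
    "f1": [1.0, 2.0, 3.0, 4.0],
    "f2": [2.0, 4.0, 6.0, 8.0],
    "f3": [-1.0, -2.0, -3.0, -4.0],
}
parts = lsh_partition(columns, n_bits=4, n_partitions=2)
print(parts)
```

In this toy input, f1 and f2 always receive the same signature because scaling a column by a positive constant does not change the sign of any projection, so they are kept together. A random split of the feature set, as in the standard distributed version described above, gives no such guarantee.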
Similar resources
Locality-Sensitive Hashing with Margin Based Feature Selection
We propose a learning method with feature selection for Locality-Sensitive Hashing. Locality-Sensitive Hashing converts feature vectors into bit arrays. These bit arrays can be used to perform similarity searches and personal authentication. The proposed method uses bit arrays longer than those used in the end for similarity and other searches and by learning selects the bits that will be used....
Locality Sensitive Hashing Based Clustering
Definition: In learning systems with kernels, the shape and size of a kernel play a critical role in accuracy and generalization. Most kernels have a distance metric parameter, which determines the size and shape of the kernel in the sense of a Mahalanobis distance. Advanced kernel learning methods tune every kernel’s distance metric individually, instead of tuning one global distance metric for all ...
Audio Feature Selection Based on Rough Set
Retaining audio features is important for audio indexing. However, in most cases the number of features is huge, so direct processing is time-consuming. Feature selection, as a preprocessing step of data mining, has proven to be very efficient in reducing dimensionality and removing irrelevant data. In this paper, we propose a feature selection algorithm based on Rough Set theory, which could find ou...
Beyond Locality-Sensitive Hashing
We present a new data structure for the c-approximate near neighbor problem (ANN) in the Euclidean space. For $n$ points in $\mathbb{R}^d$, our algorithm achieves $O_c(n^{\rho} + d \log n)$ query time and $O_c(n^{1+\rho} + d \log n)$ space, where $\rho \le 7/(8c^2) + O(1/c^3) + o_c(1)$. This is the first improvement over the result by Andoni and Indyk (FOCS 2006) and the first data structure that bypasses a locality-sensitive hashing lower boun...
Journal
Journal title: Fundamenta Informaticae
Year: 2021
ISSN: 1875-8681, 0169-2968
DOI: https://doi.org/10.3233/fi-2021-2069