A Detailed Study of the Distributed Rough Set Based Locality Sensitive Hashing Feature Selection Technique

Abstract

In the context of big data, granular computing has recently been implemented by some mathematical tools, especially Rough Set Theory (RST). As a key topic of rough set theory, feature selection has been investigated to adapt the related concepts of RST to deal with large amounts of data, leading to the development of a distributed RST version. However, despite its scalability, the distributed version faces a challenge tied to partitioning the feature search space in the distributed environment while guaranteeing data dependency. Therefore, in this manuscript, we propose a new distributed RST version based on Locality Sensitive Hashing (LSH), named LSH-dRST, for big data feature selection. LSH-dRST uses LSH to match similar features into the same bucket and maps the generated buckets into partitions to enable splitting the universe in a more efficient way. More precisely, in this paper, we perform a detailed analysis of the performance of LSH-dRST by comparing it to the standard distributed RST version, which is based on a random partitioning of the universe. We demonstrate that our LSH-dRST is scalable when dealing with large amounts of data. We also demonstrate that LSH-dRST ensures the partitioning of the high dimensional feature search space in a more reliable way; hence better preserving data dependency in the distributed environment and ensuring a lower computational cost.
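The bucketing and partitioning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which LSH family LSH-dRST uses, so the sketch assumes sign random projection, and the names `lsh_signature`, `bucket_features`, and `buckets_to_partitions`, the toy columns, and the greedy bucket-to-partition rule are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def lsh_signature(column, hyperplanes):
    """Sign-random-projection signature: one bit per random hyperplane."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, column)) >= 0 else 0
        for plane in hyperplanes
    )


def bucket_features(columns, n_bits=4, seed=0):
    """Group feature columns whose signatures collide into the same bucket."""
    rng = random.Random(seed)
    n_rows = len(next(iter(columns.values())))
    # Assumption: Gaussian hyperplanes, one per signature bit.
    hyperplanes = [
        [rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_bits)
    ]
    buckets = defaultdict(list)
    for name, column in columns.items():
        buckets[lsh_signature(column, hyperplanes)].append(name)
    return dict(buckets)


def buckets_to_partitions(buckets, n_partitions):
    """Assign each bucket, kept whole, to the currently smallest partition."""
    partitions = [[] for _ in range(n_partitions)]
    for members in sorted(buckets.values(), key=len, reverse=True):
        min(partitions, key=len).extend(members)
    return partitions


# Toy data: each feature is a column of values over the same four objects.
columns = {
    "f1": [1.0, 2.0, 3.0, 4.0],
    "f2": [2.0, 4.0, 6.0, 8.0],      # positive multiple of f1
    "f3": [-1.0, -2.0, -3.0, -4.0],  # negation of f1
    "f4": [4.0, -1.0, 0.5, -3.0],
}
buckets = bucket_features(columns)
partitions = buckets_to_partitions(buckets, n_partitions=2)
print(buckets)
print(partitions)
```

Because `f2` is a positive multiple of `f1`, both columns fall on the same side of every hyperplane and share a bucket, so they are sent to the same partition. A random split of the features, as in the standard distributed version the abstract compares against, gives no such guarantee.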

Similar resources

Locality-Sensitive Hashing with Margin Based Feature Selection

We propose a learning method with feature selection for Locality-Sensitive Hashing. Locality-Sensitive Hashing converts feature vectors into bit arrays. These bit arrays can be used to perform similarity searches and personal authentication. The proposed method uses bit arrays longer than those used in the end for similarity and other searches and by learning selects the bits that will be used....

Locality Sensitive Hashing Based Clustering

Definition: In learning systems with kernels, the shape and size of a kernel play a critical role in accuracy and generalization. Most kernels have a distance metric parameter, which determines the size and shape of the kernel in the sense of a Mahalanobis distance. Advanced kernel learning tunes every kernel’s distance metric individually, instead of tuning one global distance metric for all ...

Audio Feature Selection Based on Rough Set

Keeping audio features is important for audio indexing. However, in most cases the number of features is huge, so direct processing is time-consuming. Feature selection, as a preprocessing step of data mining, has turned out to be very efficient in reducing dimensionality and removing irrelevant data. In this paper, we propose a feature selection algorithm based on Rough Set theory, which could find ou...

Beyond Locality-Sensitive Hashing

We present a new data structure for the c-approximate near neighbor problem (ANN) in the Euclidean space. For n points in R^d, our algorithm achieves O_c(n^ρ + d log n) query time and O_c(n^(1+ρ) + d log n) space, where ρ ≤ 7/(8c^2) + O(1/c^3) + o_c(1). This is the first improvement over the result by Andoni and Indyk (FOCS 2006) and the first data structure that bypasses a locality-sensitive hashing lower boun...

Locality-Sensitive Hashing

Journal

Journal title: Fundamenta Informaticae

Year: 2021

ISSN: 1875-8681, 0169-2968

DOI: https://doi.org/10.3233/fi-2021-2069