Supervised Feature Discretization with a Dynamic Bit-Allocation Strategy
Abstract
Feature discretization (FD) is useful in many machine learning and pattern recognition tasks. By producing adequate and compact data representations, FD may improve the performance of many methods. Learning with discrete data representations often yields both lower training time and better accuracy than learning with the original features. FD may also make the data easier for humans to understand and interpret. However, many FD techniques are sub-optimal in the sense that they do not take feature interdependencies into account when discretization is carried out. In this paper, we propose a dynamic supervised FD technique that addresses feature interdependencies. Our method selects the discretization cut-points by simultaneously maximizing two criteria: (i) the dependency between the discretized features and the class label, and (ii) the independence among these features. The method performs incremental bit allocation, using mutual information (MI) as the dependency/independence measure. Experimental results on low- and medium-dimensional datasets show that the proposed method often achieves better accuracy than other well-known supervised FD approaches.
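The abstract gives only the outline of the method, so the following is a minimal Python sketch of what a greedy, MI-driven bit allocation could look like, not the authors' algorithm. Everything in it is an assumption made for illustration: the helper names (`mutual_information`, `quantize`, `allocate_bits`), the use of a uniform quantizer in place of the paper's cut-point selection, the scoring rule (relevance to the class label minus average MI with already-discretized features), and the `max_bits` cap.

```python
import numpy as np

def mutual_information(a, b):
    """MI in bits between two discrete arrays of equal length."""
    a = np.unique(a, return_inverse=True)[1].ravel()
    b = np.unique(b, return_inverse=True)[1].ravel()
    joint = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(joint, (a, b), 1.0)          # joint histogram
    joint /= joint.sum()
    pa = joint.sum(axis=1, keepdims=True)  # marginal of a
    pb = joint.sum(axis=0, keepdims=True)  # marginal of b
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (pa @ pb)[nz])))

def quantize(x, bits):
    """Uniform quantizer (an assumption): integer codes in [0, 2**bits - 1]."""
    lo, hi = x.min(), x.max()
    if bits == 0 or hi == lo:
        return np.zeros(len(x), dtype=int)
    levels = 2 ** bits
    codes = np.floor((x - lo) / (hi - lo) * levels).astype(int)
    return np.minimum(codes, levels - 1)

def allocate_bits(X, y, total_bits, max_bits=4):
    """Greedily hand out one bit at a time to the feature with the best gain."""
    n_features = X.shape[1]
    bits = np.zeros(n_features, dtype=int)

    def score(i, b):
        # Relevance to the label minus mean redundancy with other coded features.
        q = quantize(X[:, i], b)
        others = [j for j in range(n_features) if j != i and bits[j] > 0]
        redundancy = 0.0
        if others:
            redundancy = np.mean(
                [mutual_information(q, quantize(X[:, j], bits[j])) for j in others]
            )
        return mutual_information(q, y) - redundancy

    for _ in range(total_bits):
        candidates = [i for i in range(n_features) if bits[i] < max_bits]
        if not candidates:
            break
        gains = [score(i, bits[i] + 1) - score(i, bits[i]) for i in candidates]
        bits[candidates[int(np.argmax(gains))]] += 1
    return bits

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = np.linspace(0.0, 1.0, 200)        # informative feature
    x1 = rng.random(200)                   # noise feature
    y = (x0 >= 0.5).astype(int)
    print(allocate_bits(np.column_stack([x0, x1]), y, total_bits=1))
```

In this sketch the first bit goes to the feature whose one-bit code shares the most information with the class label. Later bits are penalized when the refined code overlaps with features that already hold bits, which is one simple way to trade dependency on the label against independence among features.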
Similar resources
A secure biometric discretization scheme for face template protection
In this paper, a dynamic biometric discretization scheme based on Linnartz and Tuyls’s quantization index modulation scheme (LT-QIM) [Linnartz and Tuyls, 2003] is proposed. LT-QIM extracts one bit per feature element and takes care of the intra-class variation of the biometric features. Nevertheless, LT-QIM does not consider statistical distinctiveness between users, and thus lacks the capabili...
On "Soft" bit allocation
If a random variable X with variance $\sigma^2$ is quantized optimally, being mapped into s discrete levels, the quantization error is roughly proportional to $\sigma^2 / s^2$. In many applications in speech coding and image digitization, we are given sets of random variables and we have to quantize them in some way so as to represent their realizations with as few discretized levels (bits) as possible. Thi...
An effective biometric discretization approach to extract highly discriminative, informative, and privacy-protective binary representation
Biometric discretization derives a binary string for each user based on an ordered set of biometric features. This representative string ought to be discriminative, informative, and privacy protective when it is employed as a cryptographic key in various security applications upon error correction. However, it is commonly believed that satisfying the first and the second criteria simultaneously...
Finding Contrast Patterns for Mixed Streaming Data
Contrast set mining identifies patterns in the data that can best distinguish between groups. Most of the existing work focuses on categorical and batch data, and they do not scale well for large datasets. In this work, we focus on finding contrast patterns for mixed (quantitative and categorical) and streaming data. We adapt a discretization methodology, Supervised Dynamic and Adaptive Discret...
Image Retrieval Using Dynamic Weighting of Compressed High Level Features Framework with LER Matrix
In this article, a fabulous method for database retrieval is proposed. The multi-resolution modified wavelet transform of each image is computed, and the standard deviation and average are utilized as the textural features. Then, the proposed modified bit-based color histogram and edge detectors are utilized to define the high level features. A feedback-based dynamic weighting of shap...
Publication date: 2013