Hybridized Oscillating Search Algorithm for Unsupervised Feature Selection

Author

  • D. Devakumari
Abstract

Feature selection, the search problem of finding a subset of features from a given set of measurements, has long been of interest. However, unsupervised methods are scarce. One unsupervised criterion, based on SVD-entropy (SVD: Singular Value Decomposition), scores each feature by its contribution to the entropy (CE), calculated on a leave-one-out basis. Based on this criterion, this paper proposes a Hybridized Oscillating Search (HOS) feature selection method, which does not follow a predefined direction of search (forward or backward). It is a randomized search method which begins with a random subset of features. The proposed HOS method uses a sequential feature selection method called Simple Ranking, based on CE, to obtain the initial feature subset. The subset is then modified repeatedly through up swings and down swings, which together form the oscillating cycles. The up swing adds good features to the current subset, while the down swing removes the worst features from it. After each oscillating cycle, the subset is evaluated by comparing its predictive accuracy against a known classification, using common indices such as the Rand Index and the Jaccard Coefficient. If the last oscillating cycle does not find a better subset, the process ends with the current subset.
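
The abstract gives no formulas, so the following is only a minimal sketch of the pieces it names, written under stated assumptions rather than as the paper's implementation. It assumes the commonly used definition of SVD-entropy, in which the squared singular values are normalized to sum to one and their Shannon entropy is divided by the logarithm of the number of singular values. It also assumes that CE for a feature is the entropy of the full matrix minus the entropy with that feature's column left out, and that Simple Ranking keeps the k features with the highest CE. The function names and the samples-by-features layout are hypothetical choices made for illustration. The Rand Index and Jaccard Coefficient are computed by standard pair counting.

```python
from itertools import combinations

import numpy as np


def svd_entropy(A):
    """Normalized SVD-entropy of a samples-by-features matrix (assumed definition)."""
    s = np.linalg.svd(np.asarray(A, dtype=float), compute_uv=False)
    total = np.sum(s ** 2)
    if len(s) < 2 or total == 0:
        return 0.0
    v = s ** 2 / total          # normalized squared singular values
    v = v[v > 0]                # treat 0 * log(0) as 0
    return float(-np.sum(v * np.log(v)) / np.log(len(s)))


def contribution_to_entropy(A):
    """CE_i = E(A) - E(A without feature column i), computed leave-one-out."""
    A = np.asarray(A, dtype=float)
    full = svd_entropy(A)
    return np.array([full - svd_entropy(np.delete(A, i, axis=1))
                     for i in range(A.shape[1])])


def simple_ranking(A, k):
    """Indices of the k features with the highest CE (assumed reading of Simple Ranking)."""
    ce = contribution_to_entropy(A)
    return sorted(np.argsort(-ce)[:k].tolist())


def rand_and_jaccard(labels_true, labels_pred):
    """Pair-counting Rand Index and Jaccard Coefficient between two labelings."""
    a = b = c = d = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            a += 1              # pair grouped together in both labelings
        elif same_true:
            b += 1              # together only in the known classification
        elif same_pred:
            c += 1              # together only in the predicted labeling
        else:
            d += 1              # separated in both labelings
    rand = (a + d) / (a + b + c + d)
    # Convention chosen here: Jaccard is 1.0 when no pair is grouped anywhere.
    jaccard = a / (a + b + c) if (a + b + c) else 1.0
    return rand, jaccard
```

In a search like the one described, `simple_ranking` would supply a starting subset, and `rand_and_jaccard` would score the clustering obtained on a candidate subset against the known class labels after each cycle. The clustering step and the swing logic are not specified in the abstract and are left out here.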

Similar resources

Oscillating Search Algorithms for Feature Selection

A new sub-optimal subset search method for feature selection is introduced. Unlike other subset selection methods known so far, the oscillating search does not depend on a pre-specified direction of search (forward or backward). The generality of the oscillating search concept allowed us to define several different algorithms suitable for different purposes. We can specify the need to obtain...

An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification

The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...

Oscillating Feature Subset Search Algorithm for Text Categorization

A major characteristic of text document categorization problems is the extremely high dimensionality of text data. In this paper we explore the usability of the Oscillating Search algorithm for feature/word selection in text categorization. We propose to use the multiclass Bhattacharyya distance for multinomial model as the global feature subset selection criterion for reducing the dimensionali...

A Random Forest Classifier based on Genetic Algorithm for Cardiovascular Diseases Diagnosis (RESEARCH NOTE)

Machine learning-based classification techniques provide support for the decision making process in the field of healthcare, especially in disease diagnosis, prognosis and screening. Healthcare datasets are voluminous in nature and their high dimensionality problem comprises in terms of slower learning rate and higher computational cost. Feature selection is expected to deal with the high dimen...

Publication date: 2014