Developing a Filter-Wrapper Feature Selection Method and its Application in Dimension Reduction of Gene Expression

Abstract:

The growing volume of data and number of attributes in modern datasets reduces the accuracy of learning algorithms and increases their computational complexity. Feature selection is one approach to dimensionality reduction, and it is typically carried out with filter or wrapper methods. Wrapper methods are more accurate than filter methods, whereas filter methods run faster and carry a lower computational burden. To balance the advantages and disadvantages of the two families, this study proposes a new hybrid approach. The method starts from all features in the dataset, then selects an optimal subset by combining several filter feature selection algorithms and evaluating their results with a wrapper method. Because many diseases and biological-system problems, such as cancer, can be identified and diagnosed through microarray data analysis, and because such datasets contain many features, the proposed method is evaluated on microarray data for three types of cancer. Compared with similar methods, the results show that the proposed method achieves high accuracy in classification and in identifying the factors that affect cancer.
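The abstract does not say which filters, classifier, or search strategy the authors use, so the following is only a minimal sketch of the general filter-wrapper idea under stated assumptions. It assumes two illustrative filters (a Fisher-like score and absolute Pearson correlation), pools their top-ranked features, and uses greedy forward selection scored by leave-one-out accuracy of a nearest-centroid classifier as the wrapper. All function names and the synthetic data are hypothetical and are not taken from the paper.

```python
import random
from statistics import mean, pvariance

def fisher_score(col, y):
    """Filter 1 (assumed): squared gap between class means over summed class variances."""
    a = [v for v, c in zip(col, y) if c == 0]
    b = [v for v, c in zip(col, y) if c == 1]
    return (mean(a) - mean(b)) ** 2 / (pvariance(a) + pvariance(b) + 1e-12)

def abs_correlation(col, y):
    """Filter 2 (assumed): absolute Pearson correlation between feature and label."""
    mx, my = mean(col), mean(y)
    cov = sum((v - mx) * (c - my) for v, c in zip(col, y))
    sx = sum((v - mx) ** 2 for v in col) ** 0.5
    sy = sum((c - my) ** 2 for c in y) ** 0.5
    return abs(cov) / (sx * sy + 1e-12)

def top_k(X, y, score, k):
    """Indices of the k best-scoring features under one filter."""
    n_features = len(X[0])
    scores = [score([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]

def loo_accuracy(X, y, subset):
    """Wrapper criterion: leave-one-out accuracy of a nearest-centroid classifier.
    Assumes every class still has at least one sample after one is held out."""
    correct = 0
    for i in range(len(X)):
        best_label, best_dist = None, float("inf")
        for label in sorted(set(y)):
            rows = [X[r] for r in range(len(X)) if r != i and y[r] == label]
            centroid = [mean(row[j] for row in rows) for j in subset]
            dist = sum((X[i][j] - m) ** 2 for j, m in zip(subset, centroid))
            if dist < best_dist:
                best_label, best_dist = label, dist
        correct += best_label == y[i]
    return correct / len(X)

def hybrid_select(X, y, k=5):
    """Pool the filters' top-k features, then add features greedily while
    the wrapper's accuracy keeps improving."""
    pooled = set(top_k(X, y, fisher_score, k)) | set(top_k(X, y, abs_correlation, k))
    candidates = sorted(pooled)
    selected, best = [], 0.0
    while candidates:
        scored = [(loo_accuracy(X, y, selected + [j]), j) for j in candidates]
        acc, j = max(scored, key=lambda t: t[0])
        if acc <= best:
            break  # no candidate improves the wrapper score
        selected.append(j)
        candidates.remove(j)
        best = acc
    return selected, best

def make_toy_data(n=40, n_features=30, seed=0):
    """Synthetic two-class data: only features 0 and 1 carry class signal."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(4.0 * label if j < 2 else 0.0, 1.0) for j in range(n_features)]
         for label in y]
    return X, y

X, y = make_toy_data()
selected, accuracy = hybrid_select(X, y)
print("selected features:", selected, "LOO accuracy:", accuracy)
```

In this sketch the filters cheaply narrow thousands of candidate features down to a small pool, and the more expensive wrapper is run only on that pool. Note that the filters here see all samples before the leave-one-out loop, so the reported accuracy is optimistic; a careful evaluation would repeat the filter step inside each fold.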

Similar resources

A hybrid wrapper / filter approach for feature subset selection

This work presents a hybrid wrapper/filter algorithm for feature subset selection that can use a combination of several quality criteria measures to rank the set of features of a dataset. These ranked features are used to prune the search space of subsets of possible features such that the number of times the wrapper executes the learning algorithm for a dataset with M features is reduced to O(...

Wrapper-Filter Feature Selection Algorithm Using a Memetic Framework

This correspondence presents a novel hybrid wrapper and filter feature selection algorithm for a classification problem using a memetic framework. It incorporates a filter ranking method in the traditional genetic algorithm to improve classification performance and accelerate the search in identifying the core feature subsets. Particularly, the method adds or deletes a feature from a candidate ...

A Two-phase Feature Selection Method using both Filter and Wrapper

Feature selection is an integral step of the data mining process to find an optimal subset of features. After examining the problems with both the filter and wrapper approaches to feature selection, we propose a two-phase feature selection algorithm that combines filter and wrapper and can take advantage of both approaches. It begins by running GFSIC (filter approach) to remove irrelevant features, then it runs S...

Fuzzy-rough Information Gain Ratio Approach to Filter-wrapper Feature Selection

Feature selection for various applications has been carried out for many years in many different research areas. However, there is a trade-off between finding feature subsets with minimum length and increasing the classification accuracy. In this paper, a filter-wrapper feature selection approach based on fuzzy-rough gain ratio is proposed to tackle this problem. As a search strategy, a modifie...

A Hybrid Both Filter and Wrapper Feature Selection Method for Microarray Classification

Gene expression data is widely used in disease analysis and cancer diagnosis. However, since gene expression data could contain thousands of genes simultaneously, successful microarray classification is rather difficult. Feature selection is an important pre-treatment for any classification process. Selecting a useful gene subset as a classifier not only decreases the computational time and cost, bu...

IG-GA: A Hybrid Filter/Wrapper Method for Feature Selection of Microarray Data

Gene expression profiles have great potential as a medical diagnostic tool since they represent the state of a cell at the molecular level. Available training data sets for classification of cancer types generally have a fairly small sample size compared to the number of genes involved. This fact poses an insurmountable problem to some classification methodologies due to training data limitatio...

Volume 6, Issue 2

Pages 48-59

Publication date: 2017-09
