feature subset selection

Feature Selection Methods and Algorithms

2011

L. Ladha

Feature selection is an important topic in data mining, especially for high-dimensional datasets. Feature selection (also known as subset selection) is a process commonly used in machine learning, wherein subsets of the features available from the data are selected for application of a learning algorithm. The best subset contains the fewest dimensions that contribute most to accuracy; ...

Evaluation of Feature Subset Selection, Feature Weighting, and Prototype Selection for Biomedical Applications

2008

Suzanne Little Ovidio Salvetti Petra Perner

Many medical diagnosis applications are characterized by datasets that contain under-represented classes due to the fact that the disease is much rarer than the normal case. In such a situation classifiers such as decision trees and Naïve Bayesian that generalize over the data are not the proper choice as classification methods. Case-based classifiers that can work on the samples seen so far ar...

On the Feature Selection Criterion Proposed in ‘Gait Feature Subset Selection by Mutual Information’

2009

Kiran S. Balagani Vir V. Phoha S. S. Iyengar N. Balakrishnan

Recently, Guo and Nixon [1] proposed a feature selection method based on maximizing I(x; Y), the multidimensional mutual information between feature vector x and class variable Y. Because computing I(x; Y) can be difficult in practice, Guo and Nixon proposed an approximation of I(x; Y) as the criterion for feature selection. We show that Guo and Nixon’s criterion originates from ap...
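
For reference, the quantity I(x; Y) mentioned above can be written out using the standard textbook definition of mutual information between a continuous feature vector and a discrete class variable. This is only the general definition, not the approximation by Guo and Nixon that the abstract discusses; the density symbols p(·) are notation assumed here.

```latex
I(\mathbf{x}; Y) = \sum_{y} \int p(\mathbf{x}, y)\,
  \log \frac{p(\mathbf{x}, y)}{p(\mathbf{x})\, p(y)} \, d\mathbf{x}
```

The integral runs over the full feature space, which is why direct estimation becomes hard as the dimension of x grows.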

Development of a Pharmacogenomics Model based on Support Vector Regression with Optimal Features Selection Approach to Determine the Initial Therapeutic Dose of Warfarin Anticoagulant Drug

Journal: Journal of Health and Biomedical Informatics 2023

Rouhollah Maghsoudi, Mitra Mirzarezaee, Babak Najar-Araabi, Mehdi Sadeghi

Introduction: Using artificial intelligence tools in pharmacogenomics is one of the latest bioinformatics research fields. One of the most important drugs for which determining the initial therapeutic dose is difficult is the anticoagulant warfarin. Warfarin is an oral anticoagulant for which, due to its narrow therapeutic window and the complex interrelationships of individual factors, the selection of its ...

Formulation of Feature Selection with Support Vector Machine

2015

Gend Lal Prajapati

Basic questions arise in classification concerning accuracy, ensemble size, and computational complexity. Feature selection is important for improving the performance of a classification algorithm. A classification algorithm may not scale up to the size of the full feature set, either in samples or in time, but feature selection helps us to better understand the domain with C...

Trace Ratio Criterion for Feature Selection

2008

Feiping Nie Shiming Xiang Yangqing Jia Changshui Zhang Shuicheng Yan

Fisher score and Laplacian score are two popular feature selection algorithms, both of which belong to the general graph-based feature selection framework. In this framework, a feature subset is selected based on the corresponding score (subset-level score), which is calculated in a trace ratio form. Since the number of all possible feature subsets is huge, it is often prohibitively expens...
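
As a point of reference for the Fisher score mentioned above, here is a minimal sketch of the conventional per-feature version: between-class scatter divided by within-class scatter, computed one feature at a time. It is not the subset-level trace ratio algorithm the paper proposes, and the function name `fisher_scores` and the list-of-rows input layout are assumptions made for illustration.

```python
from collections import defaultdict


def fisher_scores(X, y):
    """Illustrative per-feature Fisher score (hypothetical helper name).

    X is a list of rows (each a list of floats), y a list of class labels.
    Returns one score per feature: higher means better class separation.
    """
    # Group the rows by class label.
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)

    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = 0.0  # class-size-weighted spread of class means
        within = 0.0   # spread of samples around their own class mean
        for rows in groups.values():
            values = [r[j] for r in rows]
            class_mean = sum(values) / len(values)
            between += len(values) * (class_mean - overall_mean) ** 2
            within += sum((v - class_mean) ** 2 for v in values)
        if within > 0:
            scores.append(between / within)
        else:
            # Zero within-class scatter: perfectly separated or constant.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores
```

Ranking features by this score and keeping the top few is the feature-level shortcut; scoring whole subsets jointly is the more expensive problem the abstract refers to.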

Selecting feature subset for high dimensional data via the propositional FOIL rules

Journal: Pattern Recognition 2013

Guangtao Wang Qinbao Song Baowen Xu Yuming Zhou

Feature interaction is an important issue in feature subset selection. However, most of the existing algorithms only focus on dealing with irrelevant and redundant features. In this paper, a propositional FOIL rule based algorithm FRFS, which not only retains relevant features and excludes irrelevant and redundant ones but also considers feature interaction, is proposed for selecting feature su...

A Novel Approach to Feature Selection Using PageRank algorithm for Web Page Classification

Journal: Journal of Advances in Computer Research 2019

Farhad Rezvani, Farhad Soleimanian Gharehchopogh

In this paper, a novel filter-based approach is proposed using the PageRank algorithm to select the optimal subset of features as well as to compute their weights for web page classification. To evaluate the proposed approach multiple experiments are performed using accuracy score as the main criterion on four different datasets, namely WebKB, Reuters-R8, Reuters-R52, and 20NewsGroups. By analy...

MIFS-ND: A mutual information-based feature selection method

Journal: Expert Syst. Appl. 2014

Nazrul Hoque Dhruba Kumar Bhattacharyya Jugal K. Kalita

Feature selection is used to choose a subset of relevant features for effective classification of data. In high dimensional data classification, the performance of a classifier often depends on the feature subset used for classification. In this paper, we introduce a greedy feature selection method using mutual information. This method combines both feature–feature mutual information and featur...
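
The general idea of greedily combining feature-class and feature-feature mutual information can be sketched as follows. This is a generic relevance-minus-redundancy heuristic for discrete features, not the MIFS-ND procedure itself, whose details are cut off in the abstract; the names `mutual_information` and `greedy_mi_selection` and the column-wise input layout are assumptions for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(a, b):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    count_ab = Counter(zip(a, b))
    mi = 0.0
    for (va, vb), c in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (c / n) * log2(c * n / (count_a[va] * count_b[vb]))
    return mi


def greedy_mi_selection(columns, labels, k):
    """Pick up to k feature indices greedily (hypothetical helper name).

    columns is a list of feature columns, each a sequence of discrete values.
    Each step picks the feature with the highest feature-class MI minus its
    average feature-feature MI with the features already selected.
    """
    relevance = [mutual_information(col, labels) for col in columns]
    selected = []
    remaining = list(range(len(columns)))

    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]
            redundancy = sum(
                mutual_information(columns[j], columns[s]) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy

        best = max(remaining, key=score)  # ties go to the lowest index
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a duplicated feature in the pool, the redundancy term pushes the duplicate below a feature that carries new information about the class, which is the behaviour that relevance-only ranking misses.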

A Fast Clustering-based Feature Subset Selection Algorithm

2015

Akshay S. Agrawal Sachin Bojewar

The paper aims to propose a fast clustering algorithm for eliminating irrelevant and redundant data. Feature selection is applied to reduce the number of features in many applications where data has hundreds or thousands of features. Existing feature selection methods mainly focus on finding relevant features. In this paper, we show that feature relevance alone is insufficient for efficient...
