Relevancy contemplation in medical data analytics and ranking of feature selection algorithms

Abstract

This article performs a detailed data scrutiny on a chronic kidney disease (CKD) dataset to select efficient instances and relevant features. Data relevancy is investigated using feature extraction, hybrid outlier detection, and handling of missing values. Instances that do not influence the target are removed using data envelopment analysis to enable reduction of rows. Column reduction is achieved by ranking the attributes through feature selection methodologies, namely, extra-trees classifier, recursive feature elimination, chi-squared test, variance, and mutual information. These methodologies are ranked via the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) with weight optimization to identify the optimal features for model building from the CKD dataset and to facilitate better prediction while diagnosing the severity of the disease. An ensemble and novel similarity-based classifiers are built on the pruned dataset, and the results are thereafter compared with random forest, AdaBoost, naive Bayes, k-nearest neighbors, and support vector machines. The ensemble classifier yields an accuracy of 98.31% with the features selected by the extra-trees classifier (ETC), which is ranked as the best by TOPSIS.
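The TOPSIS ranking step mentioned in the abstract can be illustrated with a minimal sketch. This is the textbook form of TOPSIS (vector normalization, fixed weights, Euclidean distances), not the article's implementation: the abstract does not state which criteria or weights were used, and the weight optimization step is not reproduced. The criteria (`accuracy`, `n_features`), the weights, and every number in `example_matrix` are hypothetical placeholders chosen only for illustration.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] for each row (alternative).

    matrix  : list of rows, one per alternative, one column per criterion
    weights : one weight per criterion (assumed here to be given, not optimized)
    benefit : True if larger is better for that criterion, False if smaller is
    Assumes no column is all zeros and the alternatives are not all identical.
    """
    n_cols = len(weights)
    # Vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_cols)]
                for row in matrix]
    # Ideal-best and ideal-worst value per criterion.
    columns = list(zip(*weighted))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    # Closeness = distance to worst / (distance to best + distance to worst).
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical placeholder data: one row per feature selection method,
# columns = (accuracy, n_features). These are NOT results from the article.
example_methods = ["extra-trees", "RFE", "chi-squared", "variance", "mutual info"]
example_matrix = [
    [0.95, 10],
    [0.93, 12],
    [0.90, 8],
    [0.88, 15],
    [0.92, 11],
]
example_scores = topsis(example_matrix, weights=[0.7, 0.3], benefit=[True, False])
# Methods sorted from highest to lowest closeness score.
example_ranking = [m for _, m in sorted(zip(example_scores, example_methods),
                                        reverse=True)]
```

An alternative that is best on every criterion coincides with the ideal-best point and scores exactly 1; one that is worst on every criterion scores 0. Changing the weights can change the ranking, which is presumably why the abstract pairs TOPSIS with a weight optimization step.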

Similar resources

Feature Selection and Ranking

This chapter describes a method of feature selection and ranking based on human expert knowledge and training and testing of a neural network. Being computationally efficient, the method is less sensitive to round-off errors and noise in the data than the traditional methods of feature selection and ranking grounded on the sensitivity analysis. The method may lead to a significant reduction of ...

Toward Optimal Feature Selection Using Ranking Methods and Classification Algorithms

We presented a comparison between several feature ranking methods used on two real datasets. We considered six ranking methods that can be divided into two broad categories: statistical and entropy-based. Four supervised learning algorithms are adopted to build models, namely, IB1, Naive Bayes, C4.5 decision tree and the RBF network. We showed that the selection of ranking methods could be impo...

GMDH-based feature ranking and selection for improved classification of medical data

Medical applications are often characterized by a large number of disease markers and a relatively small number of data records. We demonstrate that complete feature ranking followed by selection can lead to appreciable reductions in data dimensionality, with significant improvements in the implementation and performance of classifiers for medical diagnosis. We describe a novel approach for ran...

A Fuzzy TOPSIS Approach for Big Data Analytics Platform Selection

Big data sizes are constantly increasing. Big data analytics is where advanced analytic techniques are applied on big data sets. Analytics based on large data samples reveals and leverages business change. The popularity of big data analytics platforms, which are often available as open-source, has not remained unnoticed by big companies. Google uses MapReduce for PageRank and inverted indexes....

Feature Selection and Ranking Filters

Many feature selection and feature ranking methods have been proposed. Using real and artificial data an attempt has been made to compare some of these methods. The "feature relevance index" used seems to have little effect on the relative ranking. For continuous features discretization and kernel smoothing are compared. Selection of subsets of features using hashing techniques is compared with...

Journal

Journal title: ETRI Journal

Year: 2022

ISSN: 1225-6463, 2233-7326

DOI: https://doi.org/10.4218/etrij.2022-0018