Machine Learning Framework for the Prediction of Alzheimer’s Disease Using Gene Expression Data Based on Efficient Gene Selection

Abstract

In recent years, much research has focused on using machine learning (ML) for disease prediction based on gene expression (GE) data. However, many diseases have received considerable attention, whereas some, including Alzheimer’s disease (AD), have not, perhaps due to data shortage. The present work is intended to fill this gap by introducing a symmetric framework to predict AD from GE data, with the aim to produce the most accurate prediction using the smallest number of genes. The framework works in four stages after it receives a training dataset: pre-processing, gene selection (GS), classification, and prediction. The symmetry of the model is manifested in all its stages. In the pre-processing stage, the columns of the dataset are pre-processed identically. In the GS stage, the same user-defined filter metrics are invoked on every gene individually, and so are the wrapper metrics. In the classification stage, the ML models are applied identically using the minimal set of genes selected in the preceding stage. The core of the proposed framework is a meticulous GS algorithm, which we designed to nominate eight subsets of the original set of genes provided in the training dataset. Exploring the eight subsets, the algorithm selects the best one to describe AD, and also the best ML model to predict AD using this subset. For credible results, the framework calculates performance metrics using repeated stratified k-fold cross validation (CV). To evaluate the framework, we used an AD dataset of 1157 cases and 39,280 genes, obtained by combining smaller public datasets. The cases were split into two partitions: 1000 for training/testing, using 10-fold CV repeated 30 times, and 157 for validation. From the testing/training phase, the framework identified only 1058 genes to be relevant and the support vector machine (SVM) model to be the most accurate with these genes. In the final validation, we used the 157 cases that were never seen by the SVM classifier. For credible evaluation, we evaluated the classifier via six metrics, obtaining impressive values. Specifically, the reported values are 0.97, 0.98, 0.945, 0.972, and 0.975 across sensitivity (recall), specificity, precision, kappa index, AUC, and accuracy (five values are listed for the six metrics).
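The evaluation protocol described in the abstract (a filter-based gene selection step, an SVM classifier, repeated stratified k-fold CV, and a held-out validation partition scored with six metrics) can be sketched roughly as follows. This is a minimal illustration assuming scikit-learn, not the authors' framework: the synthetic data, the ANOVA F-test filter (`f_classif`), the choice of `k=30` genes, the linear kernel, and the reduced number of CV repeats are all assumptions made for the example, and the eight-subset selection algorithm itself is not reproduced.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import (RepeatedStratifiedKFold, cross_val_score,
                                     train_test_split)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a gene expression matrix (rows = cases, columns = genes).
# The sizes are illustrative and far smaller than the 1157 x 39,280 dataset.
X, y = make_classification(n_samples=300, n_features=200, n_informative=20,
                           random_state=0)

# Hold out a validation partition that the classifier never sees during CV.
X_dev, X_val, y_dev, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Identical scaling of every column, a univariate filter, then an SVM.
# Keeping the filter inside the pipeline re-fits it on each training fold,
# so test-fold information does not leak into gene selection.
model = make_pipeline(StandardScaler(),
                      SelectKBest(f_classif, k=30),
                      SVC(kernel="linear"))

# Repeated stratified 10-fold CV (3 repeats here to keep the run short;
# the abstract reports 30 repeats).
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=0)
cv_scores = cross_val_score(model, X_dev, y_dev, cv=cv, scoring="accuracy")

# Final validation on the held-out cases, using the six metrics named above.
model.fit(X_dev, y_dev)
y_pred = model.predict(X_val)
y_score = model.decision_function(X_val)
metrics = {
    "sensitivity": recall_score(y_val, y_pred),
    "specificity": recall_score(y_val, y_pred, pos_label=0),
    "precision": precision_score(y_val, y_pred),
    "kappa": cohen_kappa_score(y_val, y_pred),
    "auc": roc_auc_score(y_val, y_score),
    "accuracy": accuracy_score(y_val, y_pred),
}

print(f"CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Specificity is computed here as the recall of the negative class, which is one common way to obtain it with scikit-learn. A wrapper-style selection step could be swapped in for the filter (for example with `RFE`), at a higher computational cost.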

Similar resources

Prediction of blood cancer using leukemia gene expression data and sparsity-based gene selection methods

Background: DNA microarray is a useful technology that simultaneously assesses the expression of thousands of genes. It can be utilized for the detection of cancer types and cancer biomarkers. This study aimed to predict blood cancer using leukemia gene expression data and a robust ℓ2,p-norm sparsity-based gene selection method. Materials and Methods: In this descriptive study, the microarray ...

Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine

DNA microarray gene expression gives access to a wealth of information spanning thousands of variables (genes). Analysis of this information can reveal the genetic causes of disease and of differences between tumors. In this study we try to reduce high-dimensional data by a statistical method to select valuable genes with high impact as biomarkers and then classify ovarian tumor based on gene expression data of...

Classification and Biomarker Genes Selection for Cancer Gene Expression Data Using Random Forest

Background & objective: Microarray and next generation sequencing (NGS) data are important sources for finding helpful molecular patterns. In addition, the large volume of gene expression data increases the challenge of identifying the biomarkers associated with cancer. The random forest (RF) is used to effectively analyze the problems of large-p and smal...

Oral Cancer Prediction Using Gene Expression Profiling and Machine Learning

Oral premalignant lesion (OPL) patients have a high risk of developing oral cancer. In this study we investigate using machine learning techniques with gene expression profiling to predict the possibility of oral cancer development in OPL patients. Four classification techniques were used: support vector machine (SVM), Regularized Least Squares (RLS), multi-layer perceptron (MLP) with back prop...

Using Rule-Based Machine Learning for Candidate Disease Gene Prioritization and Sample Classification of Cancer Gene Expression Data

Microarray data analysis has been shown to provide an effective tool for studying cancer and genetic diseases. Although classical machine learning techniques have successfully been applied to find informative genes and to predict class labels for new samples, common restrictions of microarray analysis such as small sample sizes, a large attribute space and high noise levels still limit its scie...

Journal

Journal title: Symmetry

Year: 2022

ISSN: 0865-4824, 2226-1877

DOI: https://doi.org/10.3390/sym14030491