Diagnosis of Diabetes Using a Random Forest Algorithm

Authors

  • Kozegar, Ehsan Computer Engineering, Faculty of Engineering, University of Guilan, Iran
  • Ravaei, Bahman Computer Engineering, Faculty of Engineering, Yasouj University, Iran
Abstract:

Background: Diabetes is the fourth leading cause of death in the world. Because so many people worldwide have the disease or are at risk of developing it, diabetes can be called the disease of the century. It has devastating effects on public health, and if diagnosed late it can cause irreparable damage to the eyes, kidneys, heart, and blood vessels, among other organs. Methods for diagnosing the disease in its early stages are therefore necessary. In this article, data mining is used to diagnose diabetes.

Methods: The main algorithm used in this paper is the random forest algorithm. To evaluate the effectiveness of the proposed algorithm in diagnosing diabetes, a data set of 768 samples (patients) with 8 features each was used. Because the random forest is an ensemble algorithm that combines several decision trees, it achieves high accuracy in diagnosing diabetes.

Results: Using this algorithm, we were able to increase the accuracy of diabetes diagnosis to 99.86%.

Conclusion: Diabetes is the fourth leading cause of death in the world, and different algorithms have been used to diagnose it. We tried to use an algorithm that achieves very high accuracy compared with other algorithms for diagnosing this disease.
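The idea behind the method described in the abstract, a random forest as an ensemble of decision trees that vote on the class, can be illustrated with a minimal from-scratch sketch. This is not the authors' implementation. The data below are synthetic (8 random features per sample, with a label rule invented for illustration), and every function name, the tree depth, and the number of trees are assumptions. The printed accuracy therefore says nothing about the 99.86% reported above.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, n_sub, rng, depth=0, max_depth=5):
    """Grow one tree; each node looks at a random subset of n_sub features."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority  # leaf: a plain class label
    split = best_split(rows, labels, rng.sample(range(len(rows[0])), n_sub))
    if split is None:
        return majority
    f, t = split
    lo = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    hi = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    grow = lambda part: build_tree([r for r, _ in part], [y for _, y in part],
                                   n_sub, rng, depth + 1, max_depth)
    return (f, t, grow(lo), grow(hi))  # internal node

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, labels, n_trees=15, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the training set."""
    rng = random.Random(seed)
    n_sub = max(1, int(len(rows[0]) ** 0.5))  # common sqrt(n_features) heuristic
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_sub, rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over the predictions of all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

def accuracy(forest, rows, labels):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict_forest(forest, r) == y for r, y in zip(rows, labels)) / len(rows)

def make_synthetic(n, seed):
    """Synthetic stand-in data: 8 features, label from an invented rule."""
    rng = random.Random(seed)
    rows = [[rng.random() for _ in range(8)] for _ in range(n)]
    return rows, [1 if r[0] + r[1] > 1.0 else 0 for r in rows]

train_X, train_y = make_synthetic(120, seed=1)
test_X, test_y = make_synthetic(60, seed=2)
forest = fit_forest(train_X, train_y)
test_acc = accuracy(forest, test_X, test_y)
print(f"held-out accuracy on synthetic data: {test_acc:.2f}")
```

The sketch measures accuracy on samples that were not used for training. The abstract does not state how the reported accuracy was evaluated, so this split is an assumption as well.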





Journal title

Volume 21, Issue 2

Pages 92-100

Publication date: 2021-07


