Stroke Risk Prediction: Comparing Different Sampling Algorithms
Abstract
Stroke is a serious disease that has a significant impact on the quality of life and safety of patients. Accurately predicting stroke risk is of great significance for preventing and treating stroke. In the past few years, machine learning methods have shown potential in predicting stroke risk. However, due to the imbalance of stroke data and the challenges of feature selection and model selection, stroke risk prediction still faces some difficulties. This article aims to compare the performance of different sampling algorithms in stroke risk prediction. The study used over-sampling algorithms (Random Over Sampling and SMOTE), under-sampling algorithms (Random Under Sampling and ENN), and a hybrid sampling algorithm (SMOTE-ENN), and combined them with common machine learning methods such as K-Nearest Neighbors, Logistic Regression, Decision Tree, and Support Vector Machine to build a stroke risk prediction model. Analysis of the experimental results found that the combination of SMOTE and Logistic Regression (LR) performed well in stroke risk prediction, with a high F1 score. In addition, the study found that the overall performance of the under-sampling algorithms was better than that of the over-sampling algorithms. These results provide useful references and a foundation for further research on and application of stroke risk prediction. Future research can continue to explore more sampling algorithms, machine learning methods, and feature engineering techniques to improve the accuracy and interpretability of stroke risk prediction and promote its application in clinical practice.
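To make the sampling strategies named in the abstract concrete, the following is a minimal sketch in plain Python of random over-sampling, random under-sampling, a simplified SMOTE, and the F1 score. It is not the authors' code: the function names, the toy data, and the choice of k are illustrative assumptions, and ENN and SMOTE-ENN are omitted for brevity. In practice a maintained library implementation (for example, the samplers in imbalanced-learn) would normally be used instead.

```python
import math
import random


def random_over_sample(X, y, minority=1, seed=0):
    """Duplicate randomly chosen minority rows until the classes are balanced."""
    rng = random.Random(seed)
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    n_extra = (len(y) - len(minority_rows)) - len(minority_rows)
    extra = [rng.choice(minority_rows) for _ in range(n_extra)]
    return X + extra, y + [minority] * len(extra)


def random_under_sample(X, y, minority=1, seed=0):
    """Keep every minority row and an equally sized random subset of majority rows."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    maj_idx = [i for i, label in enumerate(y) if label != minority]
    keep = sorted(min_idx + rng.sample(maj_idx, len(min_idx)))
    return [X[i] for i in keep], [y[i] for i in keep]


def smote(X, y, minority=1, k=3, seed=0):
    """Simplified SMOTE: interpolate between a minority row and one of its
    k nearest minority neighbours to create synthetic minority rows."""
    rng = random.Random(seed)
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    n_new = (len(y) - len(minority_rows)) - len(minority_rows)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_rows)
        others = [r for r in minority_rows if r is not base]
        neighbours = sorted(others, key=lambda r: math.dist(base, r))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to target
        synthetic.append([b + gap * (t - b) for b, t in zip(base, target)])
    return X + synthetic, y + [minority] * len(synthetic)


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy imbalanced data (hypothetical): 8 majority rows (label 0), 3 minority rows (label 1).
X = [[float(i), float(i % 3)] for i in range(8)] + [[10.0, 10.0], [11.0, 10.0], [10.0, 11.0]]
y = [0] * 8 + [1] * 3

X_over, y_over = random_over_sample(X, y)
X_under, y_under = random_under_sample(X, y)
X_smote, y_smote = smote(X, y, k=2)

print(len(y_over), len(y_under), len(y_smote))
```

Over-sampling and SMOTE grow the minority class to match the majority, while under-sampling shrinks the majority class. Resampling should be applied to the training split only, so that the F1 score is measured on data with the original class distribution.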
Similar resources
Comparing the Effectiveness of Machine Learning Algorithms for Defect Prediction
Software repositories with defect logs are the main resource for defect prediction. In recent years, researchers have used the vast amount of data contained in software repositories to predict the location of the defect in the code that caused a problem. In this paper, a machine learning approach is used to predict the modules with defects for an embedded data set. Public datasets from the promise r...
Selective sampling algorithms for cost-sensitive multiclass prediction
In this paper, we study the problem of active learning for cost-sensitive multiclass classification. We propose selective sampling algorithms, which process the data in a streaming fashion, querying only a subset of the labels. For these algorithms, we analyze the regret and label complexity when the labels are generated according to a generalized linear model. We establish that the gains of ac...
Evaluating and comparing algorithms for respiratory motion prediction.
In robotic radiosurgery, it is necessary to compensate for systematic latencies arising from target tracking and mechanical constraints. This compensation is usually achieved by means of an algorithm which computes the future target position. In most scientific works on respiratory motion prediction, only one or two algorithms are evaluated on a limited amount of very short motion traces. The p...
Comparing Three Data Mining Algorithms for Identifying the Associated Risk Factors of Type 2 Diabetes
Background: The increasing prevalence of type 2 diabetes has given rise to a global health burden and concern among health service providers and health administrators. The current study aimed at developing and comparing some statistical models to identify the risk factors associated with type 2 diabetes. In this light, artificial neural network (ANN), support vector machines (SVMs), and multi...
Risk Adjustment of Ischemic Stroke Outcomes for Comparing Hospital Performance
Background and Purpose—Stroke is the fourth-leading cause of death and a leading cause of long-term major disability in the United States. Measuring outcomes after stroke has important policy implications. The primary goals of this consensus statement are to (1) review statistical considerations when evaluating models that define hospital performance in providing stroke care; (2) discuss the be...
Journal
Journal title: International Journal of Advanced Computer Science and Applications
Year: 2023
ISSN: 2158-107X, 2156-5570
DOI: https://doi.org/10.14569/ijacsa.2023.01406115