ASN-SMOTE: a synthetic minority oversampling method with adaptive qualified synthesizer selection

Abstract

Oversampling is a promising preprocessing technique for imbalanced datasets: it generates new minority instances to balance the dataset. However, improperly generated minority instances, i.e., noise instances, may interfere with the learning of the classifier and affect it negatively. Given this, in this paper, we propose a simple and effective oversampling approach known as ASN-SMOTE, based on the k-nearest neighbors and the synthetic minority oversampling technology (SMOTE). ASN-SMOTE first filters noise in the minority class by determining whether the nearest neighbor of each minority instance belongs to the minority or the majority class. After that, ASN-SMOTE uses the nearest majority instance of each minority instance to effectively perceive the decision boundary, inside which the qualified minority instances are selected adaptively for each minority instance by the proposed adaptive neighbor selection scheme to synthesize a new minority instance. To substantiate the effectiveness, ASN-SMOTE has been applied to three different classifiers, and comprehensive experiments have been conducted on 24 imbalanced benchmark datasets. ASN-SMOTE is also extensively compared with nine notable oversampling algorithms. The results show that ASN-SMOTE achieves the best results in the majority of datasets. The ASN-SMOTE implementation is available at: https://www.github.com/yixinkai123/ASN-SMOTE/ .
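The three steps named in the abstract (noise filtering, adaptive selection of qualified neighbors, and synthesis) can be illustrated with a minimal NumPy sketch. This is one reading of the abstract, not the authors' implementation from the linked repository. The function name `asn_smote_sketch` and the parameters `minority_label`, `k`, `n_new`, and `seed` are hypothetical, and the choices of Euclidean distance and uniform random interpolation are assumptions.

```python
import numpy as np


def asn_smote_sketch(X, y, minority_label=1, k=5, n_new=None, seed=0):
    """Illustrative sketch only; names and defaults are assumptions."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    is_min = y == minority_label

    # Pairwise Euclidean distances between all instances (assumed metric).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

    qualified = {}  # minority index -> indices of its qualified neighbors
    for i in np.flatnonzero(is_min):
        # The k nearest neighbors of instance i, excluding i itself.
        order = [j for j in np.argsort(dist[i]) if j != i][:k]
        # Step 1, noise filter: drop i if its nearest neighbor is majority.
        if len(order) == 0 or not is_min[order[0]]:
            continue
        # Step 2, adaptive selection: keep minority neighbors that are
        # closer to i than its nearest majority instance.
        keep = []
        for j in order:
            if not is_min[j]:
                break
            keep.append(j)
        qualified[i] = keep

    # Assumed default: generate enough instances to balance the classes.
    if n_new is None:
        n_new = max(int((~is_min).sum() - is_min.sum()), 0)
    seeds = list(qualified)
    if not seeds or n_new == 0:
        return np.empty((0, X.shape[1]))

    # Step 3, synthesis: interpolate between i and a qualified neighbor.
    synthetic = []
    for _ in range(n_new):
        i = seeds[rng.integers(len(seeds))]
        j = qualified[i][rng.integers(len(qualified[i]))]
        gap = rng.random()  # uniform in [0, 1)
        synthetic.append(X[i] + gap * (X[j] - X[i]))
    return np.array(synthetic)
```

In this sketch, a minority instance sitting inside a majority cluster is never used as a seed or as a neighbor endpoint, so every synthetic point lies on a segment between two minority instances that have no majority instance closer than they are to each other.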

Related resources

RBM-SMOTE: Restricted Boltzmann Machines for Synthetic Minority Oversampling Technique

The problem of imbalanced data, i.e., when the class labels are unequally distributed, is encountered in many real-life applications, e.g., credit scoring and medical diagnostics. Various approaches aimed at dealing with imbalanced data have been proposed. One of the best-known data pre-processing methods is the Synthetic Minority Oversampling Technique (SMOTE). However, SMOTE may generate ...

SMOTE: Synthetic Minority Over-sampling Technique

An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Often real-world data sets are predominately composed of “normal” examples with only a small percentage of “abnormal” or “interesting” examples. It is also the case that the cost of misclassifying an abnormal (i...

A Two-Step Feature Selection Method to Predict Cancerlectins by Multiview Features and Synthetic Minority Oversampling Technique

Cancerlectins have an inhibitory effect on the growth of cancer cells and are currently being employed as therapeutic agents. The accurate identification of the cancerlectins should provide insight into the molecular mechanisms of cancers. In this study, a new computational method based on the RF (Random Forest) algorithm is proposed for further improving the performance of identifying cancerle...

A Synthetic Minority Oversampling Method Based on Local Densities in Low-Dimensional Space for Imbalanced Learning

Imbalanced class distribution is a challenging problem in many real-life classification problems. Existing synthetic oversampling methods suffer from the curse of dimensionality because they rely heavily on Euclidean distance. This paper proposes a new method, called Minority Oversampling Technique based on Local Densities in Low-Dimensional Space (MOT2LD for short). MOT2LD first maps each trainin...

Geometric SMOTE: Effective oversampling for imbalanced learning through a geometric extension of SMOTE

Classification of imbalanced datasets is a challenging task for standard algorithms. Although many methods exist to address this problem in different ways, generating artificial data for the minority class is a more general approach compared to algorithmic modifications. The SMOTE algorithm and its variations generate synthetic samples along a line segment that joins minority class instances. In th...

Journal

Journal title: Complex & Intelligent Systems

Year: 2022

ISSN: 2198-6053, 2199-4536

DOI: https://doi.org/10.1007/s40747-021-00638-w