DBIG-US: A two-stage under-sampling algorithm to face the class imbalance problem
Authors
Abstract
The class imbalance problem occurs when one class far outnumbers the other classes, causing most traditional classifiers to perform poorly on minority classes. To tackle this problem, a plethora of techniques have been proposed, especially centered around resampling methods. This paper introduces a two-stage method that combines the DBSCAN clustering algorithm, used to filter noisy majority-class instances, with a graph-based procedure to overcome the class imbalance. We then experimentally evaluate the behavior of the proposed method on a collection of two-class imbalanced data sets. The experimental results show an improvement in classification performance, measured by the geometric mean of the accuracy on each class, and also a higher reduction ratio compared with several state-of-the-art under-sampling techniques.
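The first stage and the evaluation metric mentioned in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the function names, the `eps` and `min_samples` parameters, and the choice to run DBSCAN's noise criterion on the majority class alone are all assumptions made for illustration. The graph-based second stage is not described in the abstract, so it is not sketched here.

```python
from math import dist, sqrt


def dbscan_noise_mask(points, eps, min_samples):
    """Return True for each point that DBSCAN would label as noise."""
    n = len(points)
    # Neighbourhood of each point: indices within eps (the point itself included).
    neighbours = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # Core points have at least min_samples points in their neighbourhood.
    is_core = [len(nb) >= min_samples for nb in neighbours]
    # A point is noise when it is not core and has no core point within eps.
    return [
        not is_core[i] and not any(is_core[j] for j in neighbours[i])
        for i in range(n)
    ]


def filter_noisy_majority(X, y, majority_label, eps, min_samples):
    """Drop majority-class instances flagged as noise; keep everything else.

    Assumption: the noise criterion is applied to the majority class only.
    """
    maj_idx = [i for i, label in enumerate(y) if label == majority_label]
    noise = dbscan_noise_mask([X[i] for i in maj_idx], eps, min_samples)
    dropped = {i for i, flag in zip(maj_idx, noise) if flag}
    keep = [i for i in range(len(y)) if i not in dropped]
    return [X[i] for i in keep], [y[i] for i in keep]


def g_mean(y_true, y_pred, positive):
    """Geometric mean of the per-class accuracies for a two-class problem."""
    pos = [p for t, p in zip(y_true, y_pred) if t == positive]
    neg = [p for t, p in zip(y_true, y_pred) if t != positive]
    tpr = sum(p == positive for p in pos) / len(pos)  # accuracy on the positive class
    tnr = sum(p != positive for p in neg) / len(neg)  # accuracy on the negative class
    return sqrt(tpr * tnr)


# Toy data (hypothetical): four clustered majority points, one isolated
# majority point, and two minority points.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (5, 5), (5, 6)]
y = [0, 0, 0, 0, 0, 1, 1]
X_f, y_f = filter_noisy_majority(X, y, majority_label=0, eps=1.5, min_samples=3)
score = g_mean([0, 0, 0, 0, 1, 1], [0, 0, 0, 1, 1, 1], positive=1)
```

On this toy data the isolated majority point (10, 10) is the only instance removed. The pairwise-distance loop is quadratic in the number of points, so a library implementation of DBSCAN would be the practical choice for real data sets.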
Similar resources
A Multiple Expert Approach to the Class Imbalance Problem Using Inverse Random under Sampling
In this paper, a novel inverse random under-sampling (IRUS) method is proposed for the class imbalance problem. The main idea is to severely under-sample the negative class (majority class), thus creating a large number of distinct negative training sets. For each training set we then find a linear discriminant which separates the positive class from the negative class. By combining the multiple de...
Semi Supervised Under-sampling: a Solution to the Class Imbalance Problem for Classification and Feature Selection
Most medical datasets are not balanced in their class labels. Furthermore, in some cases it has been noticed that the given class labels do not accurately represent characteristics of the data record. Most existing classification methods tend not to perform well on minority class examples when the dataset is extremely imbalanced. This is because they aim to optimize the overall accuracy without...
The algorithm for solving the inverse numerical range problem
The numerical range of a square matrix A is denoted by W(A) and defined as W(A) = {x*Ax : x ∈ S1}, where S1 is the unit sphere. In 2009, Russell Carden posed the inverse numerical range problem as follows: for a point z ∈ W(A), find a vector x ∈ S1 such that z = x*Ax. In this thesis, we present an algorithm for solving the inverse numerical range problem.
Adaptive Ensemble Selection for Face Re-identification under Class Imbalance
Systems for face re-identification over a network of video surveillance cameras are designed with a limited amount of reference data, and may operate under complex environments. Furthermore, target individuals provide a small proportion of the facial captures for design and during operations, and these proportions may change over time according to operational conditions. Given a diversified poo...
Author identification: Using text sampling to handle the class imbalance problem
Authorship analysis of electronic texts assists digital forensics and anti-terror investigation. Author identification can be seen as a single-label multi-class text categorization problem. Very often, there are extremely few training texts at least for some of the candidate authors or there is a significant variation in the text-length among the available training texts of the candidate author...
Journal
Journal title: Expert Systems With Applications
Year: 2021
ISSN: 1873-6793, 0957-4174
DOI: https://doi.org/10.1016/j.eswa.2020.114301