Evaluating Difficulty of Multi-class Imbalanced Data

Authors

  • Mateusz Lango
  • Krystyna Napierala
  • Jerzy Stefanowski
Abstract

Multi-class imbalanced classification is more difficult than its binary counterpart. Besides typical data difficulty factors, one should also consider the complexity of relations among classes. This paper introduces a new method for examining the characteristics of multi-class data. It is based on analyzing the neighbourhood of the minority class examples and on additional information about similarities between classes. The experimental study has shown that this method is able to identify the difficulty of the class distribution and that the estimated safe levels of minority examples are related to the prediction errors of standard classifiers.
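
The abstract does not give the formula, so the following is only a minimal sketch of what a neighbourhood-based safe level could look like, not the authors' implementation. It assumes that the safe level of an example is the mean similarity between its own class and the classes of its k nearest neighbours. The function name `safe_levels`, the `similarity` matrix layout, the Euclidean distance, and the default `k=5` are all hypothetical choices made for illustration.

```python
import numpy as np

def safe_levels(X, y, similarity, k=5):
    """Hypothetical sketch: mean class similarity over each example's k nearest neighbours.

    X          : (n, d) feature matrix
    y          : (n,) integer class labels in 0..C-1 (assumed encoding)
    similarity : (C, C) matrix; similarity[i, j] is the assumed similarity of
                 class j to class i, with 1.0 on the diagonal
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    similarity = np.asarray(similarity, dtype=float)

    # Brute-force pairwise Euclidean distances (fine for small data sets).
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # an example is not its own neighbour

    # Indices of the k nearest neighbours of every example.
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Average the similarity between each example's class and its neighbours' classes.
    return similarity[y[:, None], y[neighbours]].mean(axis=1)

# Toy usage: one isolated example of class 0 next to two examples of class 1.
X_toy = [[0.0], [1.0], [2.0]]
y_toy = [0, 1, 1]
sim_toy = [[1.0, 0.5],
           [0.5, 1.0]]
levels = safe_levels(X_toy, y_toy, sim_toy, k=2)
```

Under these assumptions, a low value marks an example surrounded by dissimilar classes, and an identity similarity matrix reduces the score to the plain fraction of same-class neighbours.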

Similar resources

Improving Imbalanced data classification accuracy by using Fuzzy Similarity Measure and subtractive clustering

Classification is one of the important parts of data mining and knowledge discovery. In most cases, the data used to train the clusters is not well distributed. This inappropriate distribution occurs when one class has a large number of samples while the number of samples in the other class is inherently low. In general, the methods of solving this kind of prob...

Adapted ensemble classification algorithm based on multiple classifier system and feature selection for classifying multi-class imbalanced data

Learning from imbalanced data, where the number of observations in one class is significantly smaller than in other classes, has gained considerable attention in the data mining community. Most existing literature focuses on the binary imbalanced case, while multi-class imbalanced learning is barely mentioned. What’s more, most proposed algorithms treated all imbalanced data consistently and aimed to ...

On Mining Fuzzy Classification Rules for Imbalanced Data

Fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification purposes. One of the major issues when applying it to imbalanced data sets is its bias toward the majority class, such that it performs poorly with respect to the minority class. However, in many cases the minority classes are more important than the majority ones. In this paper, we have extended ...

An Imbalanced Data Rule Learner

Imbalanced data learning has recently begun to receive much attention from research and industrial communities as traditional machine learners no longer give satisfactory results. Solutions to the problem generally attempt to adapt standard learners to the imbalanced data setting. Basically, higher weights are assigned to small class examples to avoid their being overshadowed by the large class...

Publication date: 2017