Search results for: imbalanced classes

Number of results: 162059

2009
Chumphol Bunkhumpornpat Krung Sinapiromsaran Chidchanok Lursinsap

The class imbalance problem occurs in various disciplines when one of the target classes has a tiny number of instances compared with the other classes. A typical classifier normally ignores or fails to detect a minority class because of its small number of instances. SMOTE is one of the over-sampling techniques that remedy this situation. It generates minority instances within the overlapping regio...

Journal: CoRR 2018
Lei Xu Alexandros Iosifidis Moncef Gabbouj

In this paper, we propose a new variant of Linear Discriminant Analysis to overcome underlying drawbacks of traditional LDA and other LDA variants targeting problems involving imbalanced classes. Traditional LDA makes assumptions about Gaussian class distributions and neglects the influence of outlier classes, which might hurt performance. We exploit intuitions coming from a probabilistic inte...

2012
Mónica Millán-Giraldo Vicente García José Salvador Sánchez

In the dissimilarity representation paradigm, several prototype selection methods have been used to cope with the topic of how to select a small representation set for generating a low-dimensional dissimilarity space. In addition, these methods have also been used to reduce the size of the dissimilarity matrix. However, these approaches assume a relatively balanced class distribution, which is ...

2003
S. B. Kotsiantis P. E. Pintelas

Many real-world data sets exhibit skewed class distributions in which almost all cases are allotted to a class and far fewer cases to a smaller, usually more interesting class. A classifier induced from an imbalanced data set has, typically, a low error rate for the majority class and an unacceptable error rate for the minority class. This paper firstly provides a systematic study on the variou...

2013
Zeping Yang Daqi Gao

In many real-world applications, the example data among different pattern classes are imbalanced and overlapping, which hinders the classification performance of many learning algorithms. In this paper, a data cleaning technique based on BNF (the borderline noise factor) is proposed to remove the borderline noise, and three under-sampling methods are studied to select the representative majority clas...

Journal: Expert Syst. Appl. 2015
Myoung-Jong Kim Dae-Ki Kang Hong Bae Kim

In classification or prediction tasks, the data imbalance problem is frequently observed when most instances belong to one majority class. The data imbalance problem has received considerable attention in the machine learning community because it is one of the main causes that degrade the performance of classifiers or predictors. In this paper, we propose a geometric mean based boosting algorithm (GMBoost...

2013
Raman Singh Harish Kumar R. K. Singla

Network traffic data is huge, varying and imbalanced because the various classes are not equally distributed. Machine learning (ML) algorithms for traffic analysis use samples from this data to recommend the actions to be taken by network administrators. Due to imbalances in the dataset, machine learning algorithms may give biased or false results, leading to serious degradation in performance ...

2016
Talayeh Razzaghi Oleg Roderick Ilya Safro Nicholas Marko

This work is motivated by the needs of predictive analytics on healthcare data as represented by Electronic Medical Records. Such data is invariably problematic: noisy, with missing entries and with imbalance in the classes of interest, leading to serious bias in predictive modeling. Since standard data mining methods often produce poor performance measures, we argue for the development of specialized te...

2013
Myoung-Jong Kim Dae-Ki Kang

In classification or prediction tasks, the data imbalance problem is frequently observed when most samples belong to one majority class. The data imbalance problem has received a lot of attention in the machine learning community because it is one of the causes that degrade the performance of classifiers or predictors. In this paper, we propose a geometric mean based boosting algorithm (GMBoost) to resolv...

2011
Peilin Zhao Steven C. H. Hoi Rong Jin Tianbao Yang

Most studies of online learning measure the performance of a learner by classification accuracy, which is inappropriate for applications where the data are unevenly distributed among different classes. We address this limitation by developing an online learning algorithm for maximizing the Area Under the ROC Curve (AUC), a metric that is widely used for measuring classification performance for imb...
