Search results for: imbalanced classes

Number of results: 162059

2011
Xiao-Lin WANG Yang YANG Hai ZHAO

Imbalanced data sets have significantly unequal distributions between classes. This between-class imbalance causes conventional classification methods to favor majority classes, resulting in very low or even no detection of minority classes. A Min-Max modular support vector machine (M-SVM) approaches this problem by decomposing the training input sets of the majority classes into subsets of sim...

2007
Jerzy Stefanowski Szymon Wilk

In the paper we discuss inducing rule-based classifiers from imbalanced data, where one class (a minority class) is under-represented in comparison to the remaining classes (majority classes). To improve the ability of a classifier to recognize this class, we propose a new selective pre-processing approach that is applied to data before inducing a rule-based classifier. The approach combines se...

2013
Tao Yang Yalin Wang Hasan Davulcu Pinghua Gong Rita Chattopadhyay Jiayu Zhou Sen Yang Shuo Xiang Qian Sun Zhi Nie Cheng Pan Rashmi Dubey

Learning from high-dimensional biomedical data has attracted a lot of attention recently. High-dimensional biomedical data often suffer from the curse of dimensionality and have imbalanced class distributions. Both of these features of biomedical data, high dimensionality and imbalanced class distributions, are challenging for traditional machine learning methods and may affect the model performance....

2015
Jerzy Stefanowski

In this paper we discuss improving rule-based classifiers learned from class-imbalanced data. Standard learning methods often do not work properly with imbalanced data as they are biased to focus on the majority classes while "disregarding" examples from the minority class. The class imbalance affects various types of classifiers, including the rule-based ones. These difficulties include two g...

2011
Josh Attenberg Şeyda Ertekin

The rich history of predictive modeling has culminated in a diverse set of techniques capable of making accurate predictions on many real-world problems. Many of these techniques demand training, whereby a set of instances with ground-truth labels (values of a dependent variable) are observed by a model-building process that attempts to capture, at least in part, the relationship between the fe...

2017
Yan Yan Tianbao Yang Yi Yang Jianhui Chen

A challenge for mining large-scale streaming data overlooked by most existing studies on online learning is the skewed distribution of examples over different classes. Many previous works have considered cost-sensitive approaches in an online setting for streaming data, where fixed costs are assigned to different classes, or ad-hoc costs are adapted based on the distribution of data received so fa...

2003
Dragos D. Margineantu

Most classification algorithms expect the frequency of examples from each class to be roughly the same. However, this is rarely the case for real-world data where very often the class probability distribution is nonuniform (or imbalanced). For these applications, the main problem is usually the fact that the costs of misclassifying examples belonging to rare classes differ significantly from t...

Journal: The Journal of Supercomputing 2023

Learning an unbiased classifier from imbalanced image datasets is challenging since the classifier may be strongly biased toward the majority class. To address this issue, some generative model-based oversampling methods have been proposed. However, most of these pay little attention to boundary samples, which contribute tiny learning classifier. In this paper, we focus on samples and propose a similar classes lat...

Journal: Fundam. Inform. 2008
Hongyu Guo Herna L. Viktor

Relational databases, with vast amounts of data (from financial transactions, marketing surveys, medical records, to health informatics observations) and complex schemas, are ubiquitous in our society. Multirelational classification algorithms have been proposed to learn from such relational repositories, where multiple interconnected tables (relations) are involved. These methods search for rel...

2010
Nuno Escudeiro Alipio Jorge

In some classification tasks, such as those related to the automatic building and maintenance of text corpora, it is expensive to obtain labeled examples to train a classifier. In such circumstances it is common to have massive corpora where a few examples are labeled (typically a minority) while others are not. Semi-supervised learning techniques try to leverage the intrinsic information in un...
