Search results for: imbalanced data sets

Number of results: 2531472

Journal: Applied Sciences 2021

The history of gravitational classification started in 1977. Over the years, these approaches have gained many extensions, which were adapted to different problems. This article is the next stage of research concerning algorithms that create data particles by their geometrical divide. Previous analyses established that the Geometrical Divide (GD) method outperforms algorithm based on classes a compoun...

2007
Florian Verhein Sanjay Chawla

The application of association rule mining to classification has led to a new family of classifiers which are often referred to as “Associative Classifiers (ACs)”. The advantage of ACs is that they are rule-based and thus lend themselves to an easier interpretation. Another advantage that ACs enjoy is that they are based on a global search criterion, unlike other rule-based classifiers – e.g. d...

2015
Zejin Ding Yanqing Zhang

In this dissertation, the problem of learning from highly imbalanced data is studied. Imbalanced data learning is of great importance and poses a challenge in many real applications. Dealing with a minority class normally needs new concepts, observations and solutions in order to fully understand the underlying complicated models. We try to systematically review and solve this special learning task in t...

Journal: Statistical Analysis and Data Mining 2008
Shohei Hido Hisashi Kashima

Imbalanced class problems appear in many real applications of classification learning. We propose a novel sampling method to improve bagging for data sets with skewed class distributions. In our new sampling method “Roughly Balanced Bagging” (RB Bagging), the number of samples in the largest and smallest classes are different, but they are effectively balanced when averaged over all subsets, wh...
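
As a rough illustration of the sampling idea in this abstract, the sketch below builds one "roughly balanced" subset in Python. It assumes that the minority class is drawn at its full size with replacement and that the majority-class count is drawn from a negative binomial distribution with success probability 0.5; the truncated abstract does not state these details, and the function name `roughly_balanced_sample` is hypothetical, so this is not the authors' implementation.

```python
import random


def roughly_balanced_sample(labels, minority, rng):
    """Return indices of one roughly balanced bootstrap subset.

    Assumption (not stated in the abstract): the majority-class count is a
    negative binomial draw, so it varies per subset but equals the minority
    count on average.
    """
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    if not minority_idx or not majority_idx:
        raise ValueError("both classes must be present")

    n_min = len(minority_idx)

    # Negative binomial draw: count "majority" coin flips that occur
    # before n_min "minority" flips, with a fair coin.
    n_maj = 0
    successes = 0
    while successes < n_min:
        if rng.random() < 0.5:
            successes += 1
        else:
            n_maj += 1

    # Sample both classes with replacement (an assumption of this sketch).
    chosen_min = [rng.choice(minority_idx) for _ in range(n_min)]
    chosen_maj = [rng.choice(majority_idx) for _ in range(n_maj)]
    return chosen_min + chosen_maj


if __name__ == "__main__":
    labels = [0] * 90 + [1] * 10  # 90 majority, 10 minority examples
    rng = random.Random(0)
    subset = roughly_balanced_sample(labels, minority=1, rng=rng)
    print(len(subset), sum(labels[i] for i in subset))
```

Each subset produced this way could train one base classifier of a bagging ensemble; the class ratio differs from subset to subset but is balanced on average.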

2013
Tao Yang Yalin Wang Hasan Davulcu Pinghua Gong Rita Chattopadhyay Jiayu Zhou Sen Yang Shuo Xiang Qian Sun Zhi Nie Cheng Pan Rashmi Dubey

Learning from high-dimensional biomedical data has attracted a lot of attention recently. High-dimensional biomedical data often suffer from the curse of dimensionality and have imbalanced class distributions. Both of these features of biomedical data, high dimensionality and imbalanced class distributions, are challenging for traditional machine learning methods and may affect the model performance....

Journal: Mathematical and Computer Modelling 2011
Zhuangyuan Zhao Ping Zhong Yaohong Zhao

The support vector machine is a kernel-based method, so its performance depends on whether the selected kernel matches the data. Conventional support vector classifiers are not suitable for imbalanced learning tasks, since they tend to assign instances to the majority class, which is the less important class. In this paper, we propose a weighted maximum margin criterion to optimize the data-...

Journal: CoRR 2014
Talayeh Razzaghi Ilya Safro

Solving different types of optimization models (including parameter fitting) for support vector machines on large-scale training data is often an expensive computational task. This paper proposes a multilevel algorithmic framework that scales efficiently to very large data sets. Instead of solving the whole training set in one optimization process, the support vectors are obtained and gradually...

Journal: Knowl.-Based Syst. 2013
Alberto Fernández Victoria López Mikel Galar María José del Jesús Francisco Herrera

DOI: http://dx.doi.org/10.1016/j.knosys.2013.01.018

The imbalanced class problem is related to the real-world application of classification in engineering....

Journal: Journal of Machine Learning Research 2015
Arash Pourhabib Bani K. Mallick Yu Ding

We propose an algorithm for two-class classification problems when the training data are imbalanced. This means the number of training instances in one of the classes is so low that the conventional classification algorithms become ineffective in detecting the minority class. We present a modification of the kernel Fisher discriminant analysis such that the imbalanced nature of the problem is e...

2013
Guohua Liang

Most traditional supervised classification learning algorithms are ineffective for highly imbalanced time series classification, which has received considerably less attention than imbalanced data problems in data mining and machine learning research. Bagging is one of the most effective ensemble learning methods, yet it has drawbacks on highly imbalanced data. Sampling methods are considered t...
