Search results for: imbalanced data sets

Number of results: 2531472

Journal: :CoRR 2015
Arash Pourhabib

When the training data in a two-class classification problem is overwhelmed by one class, most classification techniques fail to correctly identify the data points belonging to the underrepresented class. We propose Similarity-based Imbalanced Classification (SBIC) that learns patterns in the training data based on an empirical similarity function. To take the imbalanced structure of the traini...

2015
Jerzy Stefanowski

In this paper we discuss improving rule-based classifiers learned from class-imbalanced data. Standard learning methods often do not work properly with imbalanced data as they are biased to focus on the majority classes while "disregarding" examples from the minority class. The class imbalance affects various types of classifiers, including the rule-based ones. These difficulties include two g...

Journal: :CAAI Transactions on Intelligence Technology 2022

In this paper, an Observation Points Classifier Ensemble (OPCE) algorithm is proposed to deal with High-Dimensional Imbalanced Classification (HDIC) problems based on data processed using the Multi-Dimensional Scaling (MDS) feature extraction technique. First, the dimensionality of the original imbalanced data is reduced using MDS so that distances between any two different samples are preserved as well as possible. Se...

2013
Linda Shafer Saeid Nahavandi George Zobrist George W. Arnold David Jacobson Tariq Samad Ekram Hossain Mary Lanzerotti Dmitry Goldgof Haibo He Yunqian Ma

With the continuous expansion of data availability in many large-scale, complex, and networked systems, it becomes critical to advance raw data from fundamental research on the Big Data challenge to support decision-making processes. Although existing machine-learning and data-mining techniques have shown great success in many real-world applications, learning from imbalanced data is a relative...

2013
Haiqin Yang Junjie Hu Michael R. Lyu

Imbalanced learning, or learning from imbalanced data, is a challenging problem in both academia and industry. Nowadays, streaming imbalanced data have become common and raise volume, velocity, and variety issues for learning from these data. To tackle these issues, online learning algorithms are proposed to learn a linear classifier via maximizing the AUC score. However, the developed line...
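The abstract above refers to maximizing the AUC score. As a minimal sketch of why AUC is a natural target under class imbalance, the Python below computes AUC as the fraction of (positive, negative) pairs that a scorer ranks correctly and compares it with plain accuracy. The function names `pairwise_auc` and `accuracy` and the toy data are illustrative assumptions; this is not the online algorithm proposed in the paper.

```python
from typing import Sequence


def pairwise_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(scores: Sequence[float], labels: Sequence[int],
             threshold: float = 0.5) -> float:
    """Fraction of examples classified correctly at a fixed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


# Toy imbalanced data (hypothetical): 8 majority examples, 2 minority examples.
labels = [0] * 8 + [1] * 2

# A scorer that ignores the input and always favours the majority class.
constant_scores = [0.1] * 10

# A scorer that mostly ranks the minority examples above the majority ones.
informative_scores = [0.1, 0.2, 0.15, 0.3, 0.25, 0.05, 0.35, 0.45, 0.4, 0.6]

print(accuracy(constant_scores, labels), pairwise_auc(constant_scores, labels))
print(accuracy(informative_scores, labels), pairwise_auc(informative_scores, labels))
```

On this toy data the constant scorer still reaches high accuracy because the majority class dominates, while its AUC stays at chance level; the pairwise view makes that difference visible.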

2002
Anto Satriyo NUGROHO

Studies on artificial neural networks have been conducted for a long time, and their contribution has been shown in many fields. However, the application of neural networks in the real world domain is still a challenge, since nature does not always provide the required satisfactory conditions. One example is the class size imbalanced condition in which one class is heavily under-represented compared ...

Journal: :IEEE Trans. Knowl. Data Eng. 2014
Yubin Park Joydeep Ghosh

This paper introduces two kinds of decision tree ensembles for imbalanced classification problems, extensively utilizing properties of α-divergence. First, a novel splitting criterion based on α-divergence is shown to generalize several well-known splitting criteria such as those used in C4.5 and CART. When the α-divergence splitting criterion is applied to imbalanced data, one can obtain decisi...
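For readers unfamiliar with α-divergence, the sketch below computes one common normalization of it for discrete distributions, D_α(p‖q) = (1 − Σ pᵢ^α qᵢ^(1−α)) / (α(1−α)). This form is an assumption for illustration: the paper may use a different parameterization, and the code shows only the divergence, not the splitting criterion built on it. The function names are hypothetical.

```python
import math
from typing import Sequence


def alpha_divergence(p: Sequence[float], q: Sequence[float], alpha: float) -> float:
    """Alpha-divergence between two discrete distributions (assumed form).

    Assumes p and q each sum to 1 and have strictly positive entries.
    alpha = 0 and alpha = 1 are limiting cases and are not handled here.
    """
    if alpha in (0.0, 1.0):
        raise ValueError("alpha must differ from 0 and 1 in this sketch")
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    overlap = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return (1.0 - overlap) / (alpha * (1.0 - alpha))


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Kullback-Leibler divergence KL(p || q), the limit as alpha tends to 1."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Hypothetical class distributions: a skewed node versus a balanced reference.
p = [0.9, 0.1]
q = [0.5, 0.5]

print(alpha_divergence(p, q, 0.5))    # proportional to the squared Hellinger distance
print(alpha_divergence(p, q, 0.999))  # close to KL(p || q)
print(kl_divergence(p, q))
```

Under this normalization, α = 0.5 gives four times the squared Hellinger distance, and values of α near 1 approach the KL divergence, which is one way to see how a single parameter can interpolate between familiar measures.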

2017
Severin Klingler Rafael Wampfler Tanja Käser Barbara Solenthaler Markus Gross

Gathering labeled data in educational data mining (EDM) is a time- and cost-intensive task. However, the amount of available training data directly influences the quality of predictive models. Unlabeled data, on the other hand, is readily available in high volumes from intelligent tutoring systems and massive open online courses. In this paper, we present a semi-supervised classification pipelin...

2006
William Elazmeh Nathalie Japkowicz Stan Matwin

Evaluating classifier performance with ROC curves is popular in the machine learning community. To date, the only method to assess confidence of ROC curves is to construct ROC bands. In the case of severe class imbalance with few instances of the minority class, ROC bands become unreliable. We propose a generic framework for classifier evaluation to identify a segment of an ROC curve in which m...
