Search results for: imbalanced data sets

Number of results: 2531472

2008
Jerzy Stefanowski Szymon Wilk

This paper deals with inducing rule-based classifiers from imbalanced data, where one class (a minority class) is under-represented in comparison to the remaining classes (majority classes). We discuss reasons for the bias of standard classifiers toward recognition of examples from majority classes and misclassification of the minority class. To avoid limitations of sequential covering approaches, ...

Journal: :Neurocomputing 2014
Ming Gao Xia Hong Sheng Chen Christopher J. Harris Emad Khalaf

This contribution proposes a novel probability density function (PDF) estimation based over-sampling (PDFOS) approach for two-class imbalanced classification problems. The classical Parzen-window kernel function is adopted to estimate the PDF of the positive class. Then according to the estimated PDF, synthetic instances are generated as the additional training data. The essential concept is to...
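The general idea in this abstract, estimating the minority-class density with a Parzen window and then drawing synthetic instances from it, can be sketched as follows. This is a minimal illustration and not the authors' PDFOS implementation: it assumes an isotropic Gaussian kernel and a hand-picked bandwidth, whereas the excerpt does not show which kernel shape or bandwidth selection the paper uses. The function name `parzen_oversample` and its parameters are hypothetical. Sampling from a Gaussian Parzen-window estimate amounts to picking a training point uniformly at random and adding Gaussian noise scaled by the bandwidth.

```python
import numpy as np

def parzen_oversample(X_min, n_new, bandwidth, rng=None):
    """Draw n_new synthetic points from a Gaussian Parzen-window estimate
    of the minority-class density (illustrative sketch; isotropic kernel
    and fixed bandwidth are assumptions, not taken from the paper)."""
    rng = np.random.default_rng(rng)
    X_min = np.asarray(X_min, dtype=float)
    # Each kernel has equal weight, so choose kernel centres uniformly.
    idx = rng.integers(0, len(X_min), size=n_new)
    # Perturb each chosen centre with isotropic Gaussian noise.
    noise = rng.normal(0.0, bandwidth, size=(n_new, X_min.shape[1]))
    return X_min[idx] + noise

# Usage: enlarge a tiny minority class with 50 synthetic instances.
X_minority = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.5]])
X_synthetic = parzen_oversample(X_minority, n_new=50, bandwidth=0.1, rng=0)
X_augmented = np.vstack([X_minority, X_synthetic])
```

A larger bandwidth spreads the synthetic points further from the observed minority samples, while a bandwidth near zero reduces the procedure to plain random over-sampling with replacement.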

2009
Viviane Palodeto Hernán Terenzi Jefferson Luiz Brum Marques

Protein secondary structure prediction (PSSP) is one of the main tasks in computational biology. During the last few decades, much effort has been made towards solving this problem with various approaches, mainly artificial neural networks (ANN). Generally, in order to predict the protein secondary structure, the ANN training process is performed using the CB513 data set. Like protein structures d...

2013
P. Alagambigai K. Thangavel Ashok Kumar

A common challenge faced by many data clustering techniques is data complexity, which leads to issues such as overlapping, lack of representative data and class imbalance. This may degrade the clustering process. The situation gets worse when the class imbalance is very high. To cluster such imbalanced data sets, a better understanding of the dataset and efficient clu...

Journal: :CoRR 2017
Dolev Raviv Margarita Osadchy

Deep Learning (DL) methods show very good performance when trained on large, balanced data sets. However, many practical problems involve imbalanced data sets and/or classes with a small number of training samples. The performance of DL methods, as well as of more traditional classifiers, drops significantly in such settings. Most of the existing solutions for imbalanced problems focus on customizi...

Journal: :Journal of Machine Learning Research 2015
Xingye Qiao Lingsong Zhang

Classification is an important topic in statistics and machine learning with great potential in many real applications. In this paper, we investigate two popular large-margin classification methods, Support Vector Machine (SVM) and Distance Weighted Discrimination (DWD), under two contexts: the high-dimensional, low-sample size data and the imbalanced data. A unified family of classification ma...

2012
Mohsen Rahmanian Eghbal Mansoori Mehdi Zareian Jahromi M. Zareian Jahromi

The fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification. One of the major issues when applying it to imbalanced data sets is its bias toward the majority class, so that it performs poorly on the minority class. However, in many cases the minority classes are more important than the majority ones. In this paper, we have extended ...

2014
Li Peng Yu Xiao-yang

This paper presents an SVM classification method based on cluster boundary sampling and sample pruning. We explore a solution to the difficult problem of imbalanced data set classification through both data re-sampling and algorithm improvement. First, we propose the method of cluster boundary sampling, using the clustering density threshold and the boundary density ...
