imbalanced data

Flexible high-dimensional classification machines and their asymptotic properties

Journal: Journal of Machine Learning Research 2015

Xingye Qiao Lingsong Zhang

Classification is an important topic in statistics and machine learning with great potential in many real applications. In this paper, we investigate two popular large-margin classification methods, Support Vector Machine (SVM) and Distance Weighted Discrimination (DWD), under two contexts: the high-dimensional, low-sample size data and the imbalanced data. A unified family of classification ma...

A Hybrid Weighted Nearest Neighbor Approach to Mine Imbalanced Data

2016

Harshita Patel G. S. Thakur

Classification of imbalanced data has drawn significant attention from the research community in the last decade. Because the distribution of data across classes affects the performance of traditional classifiers, imbalanced data needs special treatment. Modifying the learning approach is one solution for such cases. In this paper a hybrid nearest neighbor learning approach i...

A Review on Imbalanced Learning Methods

2015

Varsha S. Babar Roshani Ade T. E. Fawcett

Learning from imbalanced data sets is now a critical task for many data mining applications such as fraud detection, anomaly detection, medical diagnosis, and information retrieval systems. The imbalanced learning problem is an unequal distribution of data between the classes, where one class contains many more samples than another. Because ...
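
The unequal class distribution described in this abstract is often summarised as an imbalance ratio (majority count divided by minority count). The following is a minimal sketch of that computation, not taken from the review itself; the function name `imbalance_ratio` and the example labels are assumptions made for illustration.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Return majority-class count divided by minority-class count.

    Hypothetical helper for illustration; not from the cited review.
    """
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes to measure imbalance")
    return max(counts.values()) / min(counts.values())


# Made-up example: 990 majority samples and 10 minority samples.
labels = ["legit"] * 990 + ["fraud"] * 10
print(imbalance_ratio(labels))
```

A ratio close to 1 indicates balanced classes; larger values indicate stronger imbalance.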

A Comparative Study of Decision Tree Algorithms for Class Imbalanced Learning in Credit Card Fraud Detection

2015

Maira Anis Mohsin Ali

Credit card fraud detection, with its inherent class imbalance, is one of the major challenges faced by financial institutions. Many classifiers are used for fraud detection on imbalanced data. Imbalanced data hinder classifier performance when overall accuracy is used as the performance measure. This biases the decision towards the majority clas...
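
The point about overall accuracy favouring the majority class can be illustrated with a small sketch. It is not the authors' experiment: the data are made up, and the helper names `accuracy` and `recall` are assumptions. A classifier that always predicts the majority class scores high accuracy while detecting no fraud at all.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of true `positive` samples that were predicted as `positive`."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        raise ValueError("no samples of the positive class in y_true")
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Made-up data: 990 legitimate transactions (0) and 10 frauds (1).
y_true = [0] * 990 + [1] * 10
# A trivial "classifier" that always predicts the majority class.
y_pred = [0] * 1000

print(accuracy(y_true, y_pred))            # 0.99
print(recall(y_true, y_pred, positive=1))  # 0.0
```

This is why studies of imbalanced data usually report class-sensitive measures (for example minority-class recall) alongside or instead of overall accuracy.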

Feature Selection and Granularity Learning in Genetic Fuzzy Rule-Based Classification Systems for Highly Imbalanced Data-Sets

Journal: International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 2012

Pedro Villar Alberto Fernández Ramón Alberto Carrasco Francisco Herrera

This paper proposes a Genetic Algorithm for jointly performing feature selection and granularity learning for Fuzzy Rule-Based Classification Systems in the scenario of highly imbalanced data-sets. We refer to imbalanced data-sets when the class distribution is not uniform, a situation that is present in many real application areas. The aim of this work is to get more compact models by sel...

Improving activated sludge classification based on imbalanced data

Journal: Journal of Hydroinformatics 2014

Automatic Fall Risk Detection Based on Imbalanced Data

Journal: IEEE Access 2021

In recent years, the declining birthrate and aging population have gradually brought countries into an ageing society. Regarding accidents that occur amongst the elderly, falls are an essential problem that quickly causes indirect physical loss. In this paper, we propose a pose estimation-based fall detection algorithm to detect fall risks. We use body ratio, acceleration and deflection as key features instead of usin...

Cost-Sensitive Support Vector Ranking for Information Retrieval

Journal: JCIT 2010

Fengxia Wang Xiao Chang

In recent years, researchers have proposed many learning-to-rank algorithms. However, in information retrieval, instances of ranks are imbalanced. After the instances of ranks are composed into pairs, the pairs of ranks are imbalanced too. In this paper, a cost-sensitive risk minimum model of pairwise learning to rank for imbalanced data sets is proposed. Following this model, the algorithm...

A Preliminar Analysis of CO2RBFN in Imbalanced Problems

2009

M. Dolores Pérez-Godoy Antonio J. Rivera Alberto Fernández María José del Jesús Francisco Herrera

In many real classification problems the data are imbalanced, i.e., the number of instances for some classes is much higher than that of the other classes. Solving a classification task using such an imbalanced data-set is difficult due to the bias of the training towards the majority classes. The aim of this contribution is to analyse the performance of CO2RBFN, a cooperative-competitive evolu...

WTEN: An Advanced Coupled Tensor Factorization Strategy for Learning from Imbalanced Data

2016

Quan Do Thanh Pham Wei Liu Kotagiri Ramamohanarao

Learning from imbalanced and sparse data in multi-mode and high-dimensional tensor formats efficiently is a significant problem in data mining research. On one hand, Coupled Tensor Factorization (CTF) has become one of the most popular methods for joint analysis of heterogeneous sparse data generated from different sources. On the other hand, techniques such as sampling, cost-sensitive learning...
