Search results for: imbalanced data sets

Number of results: 2531472

Journal: :CoRR 2015
Paula Branco Luís Torgo Rita P. Ribeiro

Many real world data mining applications involve obtaining predictive models using data sets with strongly imbalanced distributions of the target variable. Frequently, the least common values of this target variable are associated with events that are highly relevant for end users (e.g. fraud detection, unusual returns on stock markets, anticipation of catastrophes, etc.). Moreover, the events ...

Journal: :Pattern Recognition 2015
Zhongbin Sun Qinbao Song Xiaoyan Zhu Heli Sun Baowen Xu Yuming Zhou

Class imbalance problems have been reported to severely hinder the classification performance of many standard learning algorithms, and have attracted a great deal of attention from researchers in different fields. Therefore, a number of methods, such as sampling methods, cost-sensitive learning methods, and bagging- and boosting-based ensemble methods, have been proposed to solve these problems...

Journal: :Journal of chemical information and modeling 2013
Chia-Yun Chang Ming-Tsung Hsu Emilio Xavier Esposito Yufeng J. Tseng

The traditional biological assay is very time-consuming, and thus the ability to quickly screen large numbers of compounds against a specific biological target is appealing. To speed up the biological evaluation of compounds, high-throughput screening is widely used in the biomedical, biological information, and drug discovery fields. The research presented in this study focuses on the use o...

2011
William Klement Szymon Wilk Wojtek Michalowski Stan Matwin

Learning from data with severe class imbalance is difficult. Established solutions include under-sampling, adjusting the classification threshold, and using an ensemble. We examine the performance of combining these solutions to balance sensitivity and specificity for binary classification, and to reduce the MSE score for probability estimation.
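
As an illustration of two of the solutions named in this abstract, the following is a minimal sketch of random under-sampling and of how moving the classification threshold trades sensitivity against specificity. It is not the authors' method: the function names `undersample` and `sensitivity_specificity`, the toy scores, and the choice to balance the classes exactly are assumptions made for the example, and the ensemble step is not shown.

```python
import random


def undersample(X, y, majority_label, seed=0):
    """Randomly drop majority-class rows until both classes have equal size.

    Hypothetical helper for illustration; assumes a binary problem.
    """
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label != majority_label]
    majority = [i for i, label in enumerate(y) if label == majority_label]
    # Keep as many majority rows as there are minority rows.
    kept = rng.sample(majority, min(len(minority), len(majority)))
    idx = sorted(minority + kept)
    return [X[i] for i in idx], [y[i] for i in idx]


def sensitivity_specificity(y_true, scores, threshold):
    """Labels are 0 or 1; a row is predicted positive when score >= threshold."""
    tp = fn = tn = fp = 0
    for truth, score in zip(y_true, scores):
        pred = 1 if score >= threshold else 0
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fp += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy data: 8 majority (label 0) rows and 2 minority (label 1) rows.
X = list(range(10))
y = [0] * 8 + [1] * 2
X_bal, y_bal = undersample(X, y, majority_label=0)

# Toy scores: lowering the threshold raises sensitivity here.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
at_half = sensitivity_specificity(labels, scores, 0.5)
at_low = sensitivity_specificity(labels, scores, 0.3)
```

In this toy case the lower threshold recovers the second positive example without changing specificity; on real data the two measures usually move in opposite directions, which is why the threshold is tuned rather than fixed at 0.5.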

Journal: :Knowl.-Based Syst. 2016
Yijing Li Haixiang Guo Xiao Liu Yanan Li Jinling Li

Learning from imbalanced data, where observations of one class are significantly rarer than those of other classes, has gained considerable attention in the data mining community. Most existing literature focuses on the binary imbalanced case, while multi-class imbalanced learning is barely mentioned. Moreover, most proposed algorithms treated all imbalanced data consistently and aimed to ...

Journal: :Inf. Sci. 2008
Mu-Chen Chen Long-Sheng Chen Chun-Chin Hsu Wei-Rong Zeng

Recently, the class imbalance problem has attracted much attention from researchers in the field of data mining. When learning from imbalanced data in which most examples are labeled as one class and only a few belong to another class, traditional data mining approaches predict the crucial minority instances poorly. Unfortunately, many real-world data sets like health exami...

Journal: :Bio-medical materials and engineering 2015
Ke Cheng Qingfang Chen Xibei Yang Shang Gao Hualong Yu

To address the imbalanced classification problem emerging in Bioinformatics, a boundary movement-based extreme learning machine (ELM) algorithm called BM-ELM was proposed. BM-ELM first tries to explore the prior information about data distribution by condensing all training instances into the one-dimensional feature space corresponding to the original output in ELM, and then on the transforme...

Journal: :Genome informatics. International Conference on Genome Informatics 2003
Aik Choon Tan David Gilbert Yves Deville

Protein structure classification represents an important process in understanding the associations between sequence and structure as well as possible functional and evolutionary relationships. Recent structural genomics initiatives and other high-throughput experiments have populated the biological databases at a rapid pace. The amount of structural data has made traditional methods such as man...

2014
Can Liu Sandra Kübler Ning Yu

Sentiment analysis generally uses large feature sets based on a bag-of-words approach, which results in a situation where individual features are not very informative. In addition, many data sets tend to be heavily skewed. We approach this combination of challenges by investigating feature selection in order to reduce the large number of features to those that are discriminative. We examine the...

2011
Guohua Liang Chengqi Zhang

This study investigates the performance of bagging in terms of learning from imbalanced medical data. It is important for data miners to achieve highly accurate prediction models, and this is especially true for imbalanced medical applications. In these situations, practitioners are more interested in the minority class than the majority class; however, it is hard for a traditional supervised l...
