Incorporating Prior Knowledge into Boosting for Multi-Label Classification

Authors

  • Xiao Wang
  • Guo-Zheng Li
Abstract

Multi-label learning deals with problems in which each instance may belong to multiple labels simultaneously. The task is to output, for each unseen instance, a label set whose size is unknown a priori, by analyzing a training set with known label sets. Almost all existing multi-label learning algorithms are purely data-driven: the larger the training dataset, the better the classifier performs. In some cases, however, the training dataset is too small to yield an accurate model, while some prior knowledge is available. In this paper, a novel boosting-based multi-label learning algorithm called KnowBoost.MH is proposed. It is derived from the well-known AdaBoost.MH algorithm by incorporating prior knowledge into boosting to compensate for the lack of training data. Experimental results on two real-world multi-label datasets show that KnowBoost.MH outperforms AdaBoost.MH and several well-established multi-label learning algorithms, especially when training data are scarce.
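
For orientation, the AdaBoost.MH baseline named in the abstract can be sketched as follows. This is a minimal illustration of the standard algorithm with discrete decision stumps, not the authors' code, and it does not include the prior-knowledge component of KnowBoost.MH, whose details are not given in this abstract. The function names (`fit_adaboost_mh`, `scores`, `predict`), the stump weak learner, and the toy data are all assumptions made for the example. Labels are encoded as +1 (relevant) or -1 (irrelevant).

```python
import math


def stump_side(x, feature, threshold):
    """Return +1 if x[feature] exceeds the threshold, else -1."""
    return 1 if x[feature] > threshold else -1


def fit_adaboost_mh(X, Y, rounds=10):
    """Train AdaBoost.MH with decision stumps (illustrative sketch).

    X: list of feature vectors; Y: list of label vectors with entries +1/-1.
    Returns a list of (alpha, feature, threshold, votes) tuples.
    """
    m, k, d = len(X), len(Y[0]), len(X[0])
    # Weight distribution over (example, label) pairs, initially uniform.
    D = [[1.0 / (m * k)] * k for _ in range(m)]
    model = []
    for _ in range(rounds):
        best = None
        # Exhaustive search for the stump with the largest weighted edge r.
        for j in range(d):
            for theta in sorted({x[j] for x in X}):
                sides = [stump_side(x, j, theta) for x in X]
                edges = [
                    sum(D[i][l] * Y[i][l] * sides[i] for i in range(m))
                    for l in range(k)
                ]
                r = sum(abs(e) for e in edges)
                if best is None or r > best[0]:
                    # Per-label vote: orient the stump to agree with each label.
                    votes = [1 if e >= 0 else -1 for e in edges]
                    best = (r, j, theta, votes)
        r, j, theta, votes = best
        r = min(r, 1 - 1e-10)  # avoid an infinite alpha for a perfect stump
        alpha = 0.5 * math.log((1 + r) / (1 - r))
        model.append((alpha, j, theta, votes))
        # Reweight: pairs the stump got wrong gain weight, then normalize.
        z = 0.0
        for i in range(m):
            s = stump_side(X[i], j, theta)
            for l in range(k):
                D[i][l] *= math.exp(-alpha * Y[i][l] * votes[l] * s)
                z += D[i][l]
        for i in range(m):
            for l in range(k):
                D[i][l] /= z
    return model


def scores(model, x):
    """Real-valued score per label: weighted sum of stump votes."""
    k = len(model[0][3])
    return [
        sum(a * v[l] * stump_side(x, j, t) for a, j, t, v in model)
        for l in range(k)
    ]


def predict(model, x):
    """Predicted label vector: +1 where the score is positive, else -1."""
    return [1 if s > 0 else -1 for s in scores(model, x)]


# Toy data (hypothetical): one feature, two labels.
X = [[0.0], [1.0], [2.0], [3.0]]
Y = [[-1, 1], [1, 1], [1, 1], [1, -1]]
model = fit_adaboost_mh(X, Y, rounds=20)
print([predict(model, x) for x in X])
```

The per-label scores returned by `scores` can also be sorted to rank labels, which is how boosting output is typically used for multi-label ranking.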

Similar resources

Online Boosting Algorithms for Multi-label Ranking

We consider the multi-label ranking approach to multilabel learning. Boosting is a natural method for multilabel ranking as it aggregates weak predictions through majority votes, which can be directly used as scores to produce a ranking of the labels. We design online boosting algorithms with provable loss bounds for multi-label ranking. We show that our first algorithm is optimal in terms of t...

Exploiting Associations between Class Labels in Multi-label Classification

Multi-label classification has many applications in text categorization, biology, and medical diagnosis, in which multiple class labels can be assigned to each training instance simultaneously. As it is often the case that there are relationships between the labels, extracting the existing relationships between the labels and taking advantage of them during the training or prediction phases ...

MLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection

Multi-label classification has gained significant attention during recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy to multi-label learning by leveraging label-specific ...

Oil Reservoirs Classification Using Fuzzy Clustering (RESEARCH NOTE)

Enhanced Oil Recovery (EOR) is a well-known method to increase oil production from oil reservoirs. Applying EOR to a new reservoir is a costly and time consuming process. Incorporating available knowledge of oil reservoirs in the EOR process eliminates these costs and saves operational time and work. This work presents a universal method to apply EOR to reservoirs based on the available data by...

The Boosting Approach to Machine Learning: An Overview

Boosting is a general method for improving the accuracy of any given learning algorithm. Focusing primarily on the AdaBoost algorithm, this chapter overviews some of the recent work on boosting including analyses of AdaBoost’s training error and generalization error; boosting’s connection to game theory and linear programming; the relationship between boosting and logistic regression; extension...

Publication date: 2011