Learning from Sparse Data by Exploiting Monotonicity Constraints
نویسندگان
چکیده
When training data is sparse, more domain knowledge must be incorporated into the learning algorithm in order to reduce the effective size of the hypothesis space. This paper builds on previous work in which knowledge about qualitative monotonicities was formally represented and incorporated into learning algorithms (e.g., Clark & Matwin’s work with the CN2 rule learning algorithm). We show how to interpret knowledge of qualitative influences, and in particular of monotonicities, as constraints on probability distributions, and to incorporate this knowledge into Bayesian network learning algorithms. We show that this yields improved accuracy, particularly with very small training sets (e.g. less than 10 examples).
منابع مشابه
Exploiting Monotonicity via Logistic Regression in Bayesian Network Learning
An important challenge in machine learning is to find ways of learning quickly from very small amounts of training data. The only way to learn from small data samples is to constrain the learning process by exploiting background knowledge. In this report, we present a theoretical analysis on the use of constrained logistic regression for estimating conditional probability distribution in Bayesi...
متن کاملBayesian Optimization Using Monotonicity Information and Its Application in Machine Learning Hyperparameter
We propose an algorithm for a family of optimization problems where the objective can be decomposed as a sum of functions with monotonicity properties. The motivating problem is optimization of hyperparameters of machine learning algorithms, where we argue that the objective, validation error, can be decomposed as monotonic functions of the hyperparameters. Our proposed algorithm adapts Bayesia...
متن کاملby machine learning
1. Objectives: The aim was to evaluate the potential for monotonicity constraints to bias machine learning systems to learn rules that were both accurate and meaningful. 2. Methods: Two data sets from problems as diverse as screening for dementia and assessing the risk of mental retardation were collected and a rule learning system with and without monotonicity constraints was run on each. The ...
متن کاملAcceptance of rules generated by machine learning among medical experts.
OBJECTIVES The aim was to evaluate the potential for monotonicity constraints to bias machine learning systems to learn rules that were both accurate and meaningful. METHODS Two data sets, taken from problems as diverse as screening for dementia and assessing the risk of mental retardation, were collected and a rule learning system, with and without monotonicity constraints, was run on each. ...
متن کاملGene Identification from Microarray Data for Diagnosis of Acute Myeloid and Lymphoblastic Leukemia Using a Sparse Gene Selection Method
Background: Microarray experiments can simultaneously determine the expression of thousands of genes. Identification of potential genes from microarray data for diagnosis of cancer is important. This study aimed to identify genes for the diagnosis of acute myeloid and lymphoblastic leukemia using a sparse feature selection method. Materials and Methods: In this descriptive study, the expressio...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2005