The Truth is In There - Rule Extraction from Opaque Models Using Genetic Programming

Authors

  • Ulf Johansson
  • Rikard König
  • Lars Niklasson
Abstract

A common problem when using complicated models for prediction and classification is that their complexity makes them hard, or impossible, to interpret. For some scenarios this might not be a limitation, since the priority is the accuracy of the model. In other situations the limitation might be severe, since additional aspects are important to consider, e.g. comprehensibility or scalability of the model. In this study we show how the gap between accuracy and other aspects can be bridged by using a rule extraction method (termed G-REX) based on genetic programming. The extraction method is evaluated against five criteria: accuracy, comprehensibility, fidelity, scalability and generality. It is also shown how G-REX can create novel representation languages, here regression trees and fuzzy rules. The problem used is a data-mining problem from the marketing domain where the impact of advertising is predicted from investment plans. Several experiments, covering both regression and classification tasks, are evaluated. Results show that G-REX in general is capable of extracting both accurate and comprehensible representations, thus allowing high performance also in domains where comprehensibility is of the essence.
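
Two of the criteria named in the abstract, accuracy and fidelity, are easy to confuse: accuracy compares predictions with the true labels, while fidelity compares the extracted rule with the opaque model it is meant to mimic. The following is a minimal Python sketch of that distinction only; it is not G-REX and involves no genetic programming. The `opaque_model`, `extracted_rule`, the feature names (`tv`, `radio`) and all data values are hypothetical stand-ins invented for illustration.

```python
from typing import Sequence

Row = Sequence[float]


def agreement(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of positions where two label sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def opaque_model(row: Row) -> int:
    """Hypothetical stand-in for an opaque model (e.g. a neural network)."""
    tv, radio = row
    return 1 if 0.7 * tv + 0.3 * radio > 50 else 0


def extracted_rule(row: Row) -> int:
    """Hypothetical comprehensible rule: IF tv > 60 THEN high impact."""
    tv, _ = row
    return 1 if tv > 60 else 0


# Invented investment plans (tv, radio) and invented true outcomes.
rows = [(80, 10), (65, 5), (40, 90), (30, 20), (70, 60), (55, 45)]
true_labels = [1, 0, 1, 0, 1, 0]

opaque_preds = [opaque_model(r) for r in rows]
rule_preds = [extracted_rule(r) for r in rows]

opaque_accuracy = agreement(opaque_preds, true_labels)  # model vs. truth
rule_accuracy = agreement(rule_preds, true_labels)      # rule vs. truth
fidelity = agreement(rule_preds, opaque_preds)          # rule vs. model

print(f"opaque accuracy: {opaque_accuracy:.2f}")
print(f"rule accuracy:   {rule_accuracy:.2f}")
print(f"fidelity:        {fidelity:.2f}")
```

In this toy setup the rule's fidelity is lower than its accuracy, which shows that the two measures can diverge and why they are reported as separate criteria.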

Similar resources

Optimization of Dez dam reservoir operation using genetic algorithm

Water reservoir programming studies aim to determine the final cultivated land area based on predefined agricultural models and water requirements. Dam utilization rule curve is also provided in such studies. The system of Dez dam water resources was simulated applying the basic information in order to determine the capability of its reservoir to provide the objectives of the performed plan. As...

A Rule Extractor for Diagnosing the Type 2 Diabetes Using a Self-organizing Genetic Algorithm

Introduction: Constructing medical decision support models to automatically extract knowledge from data helps physicians in early diagnosis of disease. Interpretability of the inferential rules of these models is a key indicator in determining their performance in order to understand how they make decisions, and increase the reliability of their output. Methods: In this study, an automated hyb...

Modeling Ghotour-Chai River’s Rainfall-Runoff process by Genetic Programming

Considering the importance of water and of computing the amount of runoff resulting from precipitation, using appropriate methods for predicting the amount of runoff from rainfall data has become essential in recent decades. Rainfall-runoff models are used to estimate runoff generated from precipitation in the catchment area. The rainfall-runoff process is a thoroughly non-linear phenomenon....

Using Genetic Programming to Increase Rule Quality

Rule extraction is a technique aimed at transforming highly accurate opaque models like neural networks into comprehensible models without losing accuracy. G-REX is a rule extraction technique based on Genetic Programming that previously has performed well in several studies. This study has two objectives, to evaluate two new fitness functions for G-REX and to show how G-REX can be used as a ru...

DAMAGE AND PLASTICITY CONSTANTS OF CONVENTIONAL AND HIGH-STRENGTH CONCRETE PART II: STATISTICAL EQUATION DEVELOPMENT USING GENETIC PROGRAMMING

Several researchers have shown that constitutive models of concrete based on a combination of continuum damage and plasticity theories are able to reproduce the major aspects of concrete behavior. A problem with such damage-plasticity models is that their material constants need to be determined before the model can be used. These constants are in fact the connectors of constitu...

Publication date: 2004