Discovering Accurate and Interesting Classification Rules Using Genetic Algorithm

Authors

  • Janaki Gopalan
  • Reda Alhajj
  • Ken Barker
Abstract

Discovering accurate and interesting classification rules is a significant task in the post-processing stage of a data mining (DM) process. Post-processing a rule set therefore poses an optimization problem that trades off accuracy against interestingness metrics. To achieve a balance, we propose two major post-processing tasks. In the first task, we use a genetic algorithm (GA) to find the combination of rules that maximizes predictive accuracy on the sample training set; this yields the maximized accuracy. In the second task, we rank the rules by assigning objective rule interestingness (RI) measures (or weights) to the rules in the rule set. We then propose a pruning strategy that uses a GA to find the best combination of interesting rules with the maximized (or greater) accuracy. We tested our implementation on three data sets. The results are encouraging and demonstrate the applicability and effectiveness of our approach.
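The first task described in the abstract (a GA searching for the rule combination with the best training accuracy) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy data in `TRAIN`, the hand-written `RULES`, the first-match classification scheme, `DEFAULT_CLASS`, and the GA settings (bit-mask encoding, tournament selection, one-point crossover, bit-flip mutation, elitism).

```python
import random

# Hypothetical toy training set: (feature dict, class label).
TRAIN = [
    ({"age": 25, "income": 30}, "no"),
    ({"age": 45, "income": 80}, "yes"),
    ({"age": 35, "income": 60}, "yes"),
    ({"age": 22, "income": 20}, "no"),
    ({"age": 50, "income": 40}, "no"),
    ({"age": 60, "income": 90}, "yes"),
]

# Hypothetical rule set: (condition, predicted class).
RULES = [
    (lambda x: x["age"] > 30, "yes"),
    (lambda x: x["income"] > 50, "yes"),
    (lambda x: x["age"] < 30, "no"),
    (lambda x: x["income"] < 45, "no"),
]
DEFAULT_CLASS = "no"  # assumed fallback when no selected rule fires


def predict(mask, x):
    """Return the label of the first selected rule whose condition fires."""
    for keep, (cond, label) in zip(mask, RULES):
        if keep and cond(x):
            return label
    return DEFAULT_CLASS


def accuracy(mask):
    """Fitness: fraction of training examples classified correctly."""
    hits = sum(predict(mask, x) == y for x, y in TRAIN)
    return hits / len(TRAIN)


def evolve(pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Search bit masks over RULES for the subset with the best accuracy."""
    rng = random.Random(seed)
    n = len(RULES)
    # Include the full rule set so the result is never worse than using all rules.
    pop = [[1] * n]
    pop += [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        nxt = [max(pop, key=accuracy)[:]]  # elitism: carry over the best mask
        while len(nxt) < pop_size:
            # Binary tournament selection for each parent.
            a = max(rng.sample(pop, 2), key=accuracy)
            b = max(rng.sample(pop, 2), key=accuracy)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per gene.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=accuracy)
    return best, accuracy(best)
```

Calling `evolve()` returns a bit mask over `RULES` and its training accuracy. The second task could plausibly reuse the same loop with a fitness that also rewards the RI weights of the selected rules, but how the paper combines the two is not stated in the abstract, so that step is left out of this sketch.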

Similar resources

A Distributed-Population Genetic Algorithm for Discovering Interesting Prediction Rules

In data mining, the quality of prediction rules basically involves three criteria: accuracy, comprehensibility and interestingness. The majority of the rule induction literature focuses on discovering accurate, comprehensible rules. In this paper we also take these two criteria into account, but we go beyond them in the sense that we aim at discovering rules that are interesting (surprising) for th...

Mining of Interesting Prediction Rules with Uniform Two-Level Genetic Algorithm

The main goal of data mining is to extract accurate, comprehensible and interesting knowledge from databases that may be considered as large search spaces. In this paper, a new, efficient type of genetic algorithm (GA) called uniform two-level GA is proposed as a search strategy to discover truly interesting, high-level prediction rules, a difficult problem and relatively little researched, rat...

A Rule Extractor for Diagnosing the Type 2 Diabetes Using a Self-organizing Genetic Algorithm

Introduction: Constructing medical decision support models to automatically extract knowledge from data helps physicians in early diagnosis of disease. Interpretability of the inferential rules of these models is a key indicator in determining their performance in order to understand how they make decisions, and increase the reliability of their output. Methods: In this study, an automated hyb...

Knowledge Acquisition tool for Classification Rules using Genetic Algorithm Approach

Classification Rule Mining (CRM) is a data mining technique for discovering important classification rules from large datasets. This work presents an efficient genetic algorithm for discovering significant IF-THEN rules from a given dataset. The proposed algorithm consists of two main steps. The first step generates a set of classification rules, and the second step deletes the weak rules and selects o...


Publication date: 2006