Discovering Accurate and Interesting Classification Rules Using Genetic Algorithm
Abstract
Discovering accurate and interesting classification rules is a significant task in the post-processing stage of a data mining (DM) process. Post-processing a rule set is therefore an optimization problem that trades off accuracy against interestingness metrics. To strike a balance, this paper proposes two major post-processing tasks. In the first task, we use a genetic algorithm (GA) to find the combination of rules that maximizes predictive accuracy on the sample training set; this yields the maximized accuracy. In the second task, we rank the rules by assigning objective rule interestingness (RI) measures (or weights) to the rules in the rule set. We then propose a pruning strategy that uses a GA to find the best combination of interesting rules whose accuracy matches or exceeds that maximum. We tested our implementation on three data sets. The results are encouraging and demonstrate the applicability and effectiveness of our approach.
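The first task can be illustrated with a minimal sketch. The abstract does not give the encoding, the operators, or the parameters, so everything below is an assumption made for illustration: rules are (conditions, class) pairs, a candidate solution is a boolean mask over the rule set, prediction uses the first selected rule that matches, and the GA uses tournament selection, one-point crossover, bit-flip mutation, and elitism. The names `predict`, `accuracy`, and `select_rules` and the toy data are hypothetical.

```python
import random

# Assumed representation: a rule is (conditions, predicted_class),
# where conditions maps an attribute name to the value it must have.

def predict(rules, mask, example, default):
    """Return the class of the first selected rule that covers the example."""
    for (conds, label), keep in zip(rules, mask):
        if keep and all(example.get(a) == v for a, v in conds.items()):
            return label
    return default  # no selected rule fired

def accuracy(rules, mask, data, default):
    """Fraction of (example, label) pairs the selected rules classify correctly."""
    hits = sum(predict(rules, mask, x, default) == y for x, y in data)
    return hits / len(data)

def select_rules(rules, data, default, pop_size=20, generations=30,
                 p_mut=0.05, seed=0):
    """Search for a rule subset (boolean mask) with high training accuracy."""
    rng = random.Random(seed)
    n = len(rules)

    def fitness(mask):
        return accuracy(rules, mask, data, default)

    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        nxt = [best[:]]  # elitism: the best mask so far always survives
        while len(nxt) < pop_size:
            # Binary tournament selection for each parent.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            # One-point crossover, then independent bit-flip mutation.
            cut = rng.randrange(1, n) if n > 1 else 0
            child = p1[:cut] + p2[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=fitness)
    return best, fitness(best)

# Toy rule set and training sample (hypothetical data).
rules = [
    ({"outlook": "sunny"}, "no"),
    ({"outlook": "rain"}, "yes"),
    ({"windy": True}, "yes"),
]
data = [
    ({"outlook": "sunny", "windy": True}, "no"),
    ({"outlook": "sunny", "windy": False}, "no"),
    ({"outlook": "rain", "windy": True}, "yes"),
    ({"outlook": "rain", "windy": False}, "yes"),
]
mask, acc = select_rules(rules, data, default="yes")
print(mask, acc)
```

For the second task, one plausible variation is to replace `fitness` with a score that also rewards the RI weights of the selected rules while requiring accuracy at least as high as the value found here; how the paper actually combines the two is not stated in the abstract.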
Similar resources
A Distributed-Population Genetic Algorithm for Discovering Interesting Prediction Rules
In data mining, the quality of prediction rules basically involves three criteria: accuracy, comprehensibility and interestingness. The majority of the rule induction literature focuses on discovering accurate, comprehensible rules. In this paper we also take these two criteria into account, but we go beyond them in the sense that we aim at discovering rules that are interesting (surprising) for th...
Mining of Interesting Prediction Rules with Uniform Two-Level Genetic Algorithm
The main goal of data mining is to extract accurate, comprehensible and interesting knowledge from databases that may be considered as large search spaces. In this paper, a new, efficient type of genetic algorithm (GA) called uniform two-level GA is proposed as a search strategy to discover truly interesting, high-level prediction rules, a difficult problem and relatively little researched, rat...
A Rule Extractor for Diagnosing the Type 2 Diabetes Using a Self-organizing Genetic Algorithm
Introduction: Constructing medical decision support models to automatically extract knowledge from data helps physicians in early diagnosis of disease. Interpretability of the inferential rules of these models is a key indicator in determining their performance in order to understand how they make decisions, and increase the reliability of their output. Methods: In this study, an automated hyb...
Knowledge Acquisition tool for Classification Rules using Genetic Algorithm Approach
Classification Rule Mining (CRM) is a data mining technique for discovering important classification rules from large datasets. This work presents an efficient genetic algorithm for discovering significant IF-THEN rules from a given dataset. The proposed algorithm consists of two main steps. The first step generates a set of classification rules and the second step deletes the weak rules and selects o...
Publication date: 2006