Improving Cascade Classifier Precision by Instance Selection and Outlier Generation
Authors
Abstract
Besides the curse of dimensionality and imbalanced classes, unfavorable data distributions can hamper classification accuracy. This is particularly problematic as the dimensionality of the classification task increases. A classifier that can handle high-dimensional and imbalanced data sets is the cascade classification method for time series. The cascade classifier can compound unfavorable data distributions by projecting the high-dimensional data set onto low-dimensional subsets. A classifier is trained for each of the low-dimensional data subsets, and their predictions are aggregated into an overall result. Because the errors of the individual classifiers accumulate in the overall result, small improvements in each low-dimensional classifier can improve the overall classification accuracy. We therefore propose two data preprocessing methods to improve the cascade classifier. The first method is instance selection, a technique for selecting representative examples for the classification task. Furthermore, artificial infeasible examples can improve classification performance. Even if high-dimensional infeasible examples are available, their projection to low-dimensional space is not possible due to projection errors. As a second data preprocessing method, we propose generating artificial infeasible examples directly in low-dimensional space. We show, for micro Combined Heat and Power plant power production time series and for an artificial and complex data set, that the proposed data preprocessing methods increase the performance of the cascade classifier by increasing the selectivity of the learned decision boundaries.
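The cascade idea described in the abstract (project, train one classifier per low-dimensional subset, aggregate) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the low-dimensional subsets are pairs of neighboring time steps, that aggregation is a logical AND, and it uses a hypothetical nearest-neighbor stage (`NearestNeighborStage`) as a stand-in for whatever classifier the paper trains. All names, the radius, and the toy data are illustrative.

```python
import math


def project(series, i):
    """Project a time series onto the 2-D subset (x_i, x_{i+1}) (assumed projection)."""
    return (series[i], series[i + 1])


class NearestNeighborStage:
    """Hypothetical stand-in for one low-dimensional classifier.

    Accepts a point if it lies within `radius` of a stored feasible point.
    """

    def __init__(self, radius):
        self.radius = radius
        self.points = []

    def fit(self, points):
        self.points = list(points)
        return self

    def predict(self, point):
        return any(math.dist(point, p) <= self.radius for p in self.points)


class Cascade:
    """One stage per pair of neighboring dimensions, aggregated with AND."""

    def __init__(self, n_dims, radius=0.1):
        self.stages = [NearestNeighborStage(radius) for _ in range(n_dims - 1)]

    def fit(self, feasible_series):
        for i, stage in enumerate(self.stages):
            stage.fit(project(s, i) for s in feasible_series)
        return self

    def predict(self, series):
        # A single rejecting stage rejects the whole series, so
        # per-stage errors accumulate in the overall result.
        return all(
            stage.predict(project(series, i))
            for i, stage in enumerate(self.stages)
        )


# Toy feasible set: constant series of length 4 at levels 0.0, 0.1, ..., 1.0.
feasible = [[k / 10] * 4 for k in range(11)]
cascade = Cascade(n_dims=4, radius=0.1).fit(feasible)

accepted = cascade.predict([0.5, 0.5, 0.5, 0.5])  # True: matches a training series
rejected = cascade.predict([0.5, 0.5, 0.9, 0.5])  # False: (0.5, 0.9) is far from all stage points
```

Because the aggregate is an AND over all stages, a looser decision boundary in any single stage loosens the whole cascade, which is why the abstract argues that improving each low-dimensional classifier matters.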
Similar resources
Iranian Vehicle License Plate Detection based on Cascade Classifier
A license plate recognition system contains three main steps: plate detection, character segmentation and character recognition. The first and foremost step of this system is the plate detection stage where the plate is located from the input image. In this paper an effective plate detection approach is developed based on a cascade classifier. A two-phase training approach is proposed to enhanc...
Improving Accuracy in Intrusion Detection Systems Using Classifier Ensemble and Clustering
Recently, with the development of technology, the number of network-based services is increasing, and sensitive information of users is shared through the Internet. Accordingly, large-scale malicious attacks on computer networks could cause severe disruption to network services, so cybersecurity has become a major concern for networks. An intrusion detection system (IDS) could be cons...
NEW CRITERIA FOR RULE SELECTION IN FUZZY LEARNING CLASSIFIER SYSTEMS
Designing an effective criterion for selecting the best rule is a major problem in the process of implementing Fuzzy Learning Classifier (FLC) systems. Conventionally, confidence and support, or combined measures of these, are used as criteria for fuzzy rule evaluation. In this paper, new entities, namely precision and recall from the field of Information Retrieval (IR) systems, are adapted as alternative...
Evaluation of Classifiers in Software Fault-Proneness Prediction
The reliability of software depends on its fault-prone modules: the fewer fault-prone units the software contains, the more we may trust it. Therefore, if we are able to predict the number of fault-prone modules of software, it will be possible to judge the software reliability. In predicting software fault-prone modules, one of the contributing features is software metric by which one ...
Instance Selection to Improve Gamma Classifier
Pre-processing the dataset is an important stage in the Knowledge Discovery in Datasets (KDD) process. Filtering noise through instance selection is a necessary task. It reduces the risk of using misclassified and non-representative instances to train supervised classifiers. This study aims at improving the performance of the Gamma associative classifier, by introducing a novel similar...
Publication date: 2016