Improving Cascade Classifier Precision by Instance Selection and Outlier Generation

Authors

  • Judith Neugebauer
  • Oliver Kramer
  • Michael Sonnenschein
Abstract

Besides the curse of dimensionality and imbalanced classes, unfavorable data distributions can hamper classification accuracy. This is particularly problematic as the dimensionality of the classification task increases. A classifier that can handle high-dimensional and imbalanced data sets is the cascade classification method for time series. The cascade classifier can compound unfavorable data distributions by projecting the high-dimensional data set onto low-dimensional subsets. A classifier is trained for each of the low-dimensional data subsets, and their predictions are aggregated into an overall result. For the cascade classifier, the errors of each classifier accumulate in the overall result, so small improvements in each small classifier can improve the overall classification accuracy. We therefore propose two data preprocessing methods to improve the cascade classifier. The first method is instance selection, a technique for selecting representative examples for the classification task. Furthermore, artificial infeasible examples can improve classification performance. Even if high-dimensional infeasible examples are available, their projection to low-dimensional space is not possible due to projection errors. We propose a second data preprocessing method for generating artificial infeasible examples in low-dimensional space. We show, for micro Combined Heat and Power plant power production time series and for an artificial and complex data set, that the proposed data preprocessing methods increase the performance of the cascade classifier by increasing the selectivity of the learned decision boundaries.
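
The cascade idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the subsets, the base classifier, or the aggregation rule, so the following are assumptions made only for illustration: each low-dimensional subset is a pair of neighboring time steps, each stage is a toy nearest-neighbor one-class model trained on feasible examples only, and stage predictions are aggregated by logical AND. All function names and the training data are hypothetical.

```python
import math

def fit_stage(points):
    """Toy one-class model for one 2-D subset (an assumption, not the
    authors' classifier): store the feasible points and accept anything
    within `radius` of one of them."""
    # radius = largest nearest-neighbor distance among the training points
    radius = max(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    )
    return points, radius

def stage_accepts(model, point):
    """Return True if `point` lies within the stage's radius of a stored point."""
    points, radius = model
    return min(math.dist(point, q) for q in points) <= radius

def fit_cascade(series_list):
    """Train one stage per pair of neighboring time steps (t, t + 1)."""
    dim = len(series_list[0])
    return [
        fit_stage([(s[t], s[t + 1]) for s in series_list])
        for t in range(dim - 1)
    ]

def cascade_predict(cascade, series):
    """Aggregate by logical AND: one rejecting stage rejects the series."""
    return all(
        stage_accepts(model, (series[t], series[t + 1]))
        for t, model in enumerate(cascade)
    )

# Hypothetical feasible class: short series that change slowly over time.
train = [
    [k / 20 + slope * t for t in range(4)]
    for k in range(21)
    for slope in (-0.05, 0.0, 0.05)
]
cascade = fit_cascade(train)

print(cascade_predict(cascade, [0.52, 0.55, 0.58, 0.61]))  # slow drift → True
print(cascade_predict(cascade, [0.50, 0.50, 0.90, 0.90]))  # one large jump → False
```

Under the AND aggregation assumed here, a single stage that wrongly rejects its projection rejects the whole series, which is one way to read the abstract's remark that per-stage errors accumulate in the overall result.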

Similar resources

Iranian Vehicle License Plate Detection based on Cascade Classifier

A license plate recognition system contains three main steps: plate detection, character segmentation and character recognition. The first and foremost step of this system is the plate detection stage where the plate is located from the input image. In this paper an effective plate detection approach is developed based on a cascade classifier. A two-phase training approach is proposed to enhanc...

Improving Accuracy in Intrusion Detection Systems Using Classifier Ensemble and Clustering

With recent advances in technology, the number of network-based services is increasing, and users' sensitive information is shared over the Internet. Accordingly, large-scale malicious attacks on computer networks could cause severe disruption to network services, so cybersecurity has become a major concern for networks. An intrusion detection system (IDS) could be cons...

NEW CRITERIA FOR RULE SELECTION IN FUZZY LEARNING CLASSIFIER SYSTEMS

Designing an effective criterion for selecting the best rule is a major problem in the process of implementing Fuzzy Learning Classifier (FLC) systems. Conventionally, confidence and support, or combined measures of these, are used as criteria for fuzzy rule evaluation. In this paper, new entities, namely precision and recall from the field of Information Retrieval (IR) systems, are adapted as alternative...

Evaluation of Classifiers in Software Fault-Proneness Prediction

The reliability of software depends on its fault-prone modules: the fewer fault-prone units the software contains, the more we may trust it. Therefore, if we are able to predict the number of fault-prone modules of software, it will be possible to judge the software reliability. In predicting software fault-prone modules, one of the contributing features is software metric by which one ...

Instance Selection to Improve Gamma Classifier

Pre-processing the dataset is an important stage in the Knowledge Discovery in Datasets (KDD) process. Filtering noise through instance selection is a necessary task, because it reduces the risk of using misclassified and non-representative instances to train supervised classifiers. This study aims at improving the performance of the Gamma associative classifier, by introducing a novel similar...

Publication date: 2016