A new imputation method based on genetic programming and weighted KNN for symbolic regression with incomplete data
Abstract
Incompleteness is one of the problematic data quality challenges in real-world machine learning tasks. A large number of studies have been conducted to address this challenge. However, most existing studies focus on the classification task, and only a limited number of studies on symbolic regression with missing values exist. In this work, a new imputation method for symbolic regression with incomplete data is proposed. The method aims to improve both the effectiveness and the efficiency of imputing missing values for symbolic regression. This method is based on genetic programming (GP) and weighted K-nearest neighbors (KNN). It constructs GP-based models using other available features to predict the missing values of incomplete features. The instances used for constructing such models are selected using weighted KNN. The experimental results on data sets show that the proposed method outperforms state-of-the-art methods with respect to accuracy, performance, and time.
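The two-stage idea in the abstract (select instances with weighted KNN, then build a predictive model for the missing feature from the other available features) can be illustrated with a minimal sketch. This is not the authors' implementation. In particular, the GP-evolved model is replaced here by a plain inverse-distance weighted mean, the per-feature `weights` in the distance are an assumption about what "weighted" means, and all function names and the `MISSING` marker are hypothetical.

```python
import math

MISSING = None  # hypothetical marker for a missing value


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance over features observed in both rows."""
    total, used = 0.0, 0
    for x, y, w in zip(a, b, weights):
        if x is MISSING or y is MISSING:
            continue
        total += w * (x - y) ** 2
        used += 1
    return math.sqrt(total) if used else math.inf


def select_neighbours(rows, target, feature, k, weights):
    """Return (distance, row) pairs for the k nearest rows with `feature` observed."""
    scored = []
    for r in rows:
        if r is target or r[feature] is MISSING:
            continue
        d = weighted_distance(r, target, weights)
        if math.isfinite(d):  # skip rows sharing no observed feature
            scored.append((d, r))
    scored.sort(key=lambda pair: pair[0])
    return scored[:k]


def predict_stand_in(neighbours, feature):
    """Stand-in for the GP-based model: inverse-distance weighted mean."""
    num = den = 0.0
    for d, r in neighbours:
        w = 1.0 / (d + 1e-9)  # small constant avoids division by zero
        num += w * r[feature]
        den += w
    return num / den


def impute(rows, k=2, weights=None):
    """Return a copy of `rows` with missing entries filled where possible."""
    weights = weights or [1.0] * len(rows[0])
    completed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is MISSING:
                neighbours = select_neighbours(rows, row, j, k, weights)
                if neighbours:  # leave the gap if no usable neighbour exists
                    completed[i][j] = predict_stand_in(neighbours, j)
    return completed


if __name__ == "__main__":
    data = [[1.0, 2.0], [2.0, 4.0], [3.0, MISSING], [10.0, 20.0]]
    print(impute(data, k=2))
```

In a version closer to the description above, `predict_stand_in` would be replaced by a symbolic regression model evolved by GP on the selected instances, taking the observed features of the incomplete instance as inputs.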
Similar resources
A new imputation method for incomplete binary data
In data analysis problems where the data are represented by vectors of real numbers, it is often the case that some of the data points will have “missing values”, meaning that one or more of the entries of the vector that describes the data point is not known. In this paper, we propose a new approach to the imputation of missing binary values. The technique we introduce employs a “similarity me...
Genetic Programming for Symbolic Regression
Genetic programming (GP) is a supervised learning method motivated by an analogy to biological evolution. GP creates successor hypotheses by repeatedly applying mutation and crossover to parts of the current best hypotheses, in the expectation of finding a good solution during the evolutionary process. In this report, the task to be performed was a symbolic regression problem, which is to find the symbolic function...
Sequential Symbolic Regression with Genetic Programming
This chapter describes the Sequential Symbolic Regression (SSR) method, a new strategy for function approximation in symbolic regression. The SSR method is inspired by the sequential covering strategy from machine learning, but instead of sequentially reducing the size of the problem being solved, it sequentially transforms the original problem into potentially simpler problems. This transforma...
KNN Regression as Geo-Imputation Method for Spatio-Temporal Wind Data
The shift from traditional energy systems to distributed systems of energy suppliers and consumers, together with the volatility of power from renewable energy, implies the need for effective short-term prediction models. These machine learning models are based on measured sensor information. In practice, sensors might fail for several reasons. The prediction models naturally cannot work properly with inc...
Data Mining using Genetic Programming Classification and Symbolic Regression
Thesis submitted for the degree of Doctor at Leiden University, by authority of the Rector Magnificus Dr. Doctoral Committee Supervisor: The work in this thesis has been carried out under the auspices of the research school IPA (Institute for Programming research and Algorithmics).
Journal
Journal title: Soft Computing
Year: 2021
ISSN: 1433-7479, 1432-7643
DOI: https://doi.org/10.1007/s00500-021-05590-y