A Wavelet PM2.5 Prediction System Using Optimized Kernel Extreme Learning with Boruta-XGBoost Feature Selection

Authors

Abstract

The fine particulate matter (PM2.5) concentration is a vital source of information and an essential indicator for measuring and studying the concentrations of other air pollutants. More accurate PM2.5 predictions, and a high-accuracy PM2.5 prediction model, are crucial because of their social impacts and cross-field applications in geospatial engineering. To further boost the accuracy of PM2.5 prediction results, this paper proposes a new wavelet PM2.5 prediction system (called the WD-OSMSSA-KELM model) based on a new, improved variant of the salp swarm algorithm (OSMSSA), the kernel extreme learning machine (KELM), wavelet decomposition, and Boruta-XGBoost (B-XGB) feature selection. First, we applied B-XGB feature selection to identify the best features for predicting hourly PM2.5 concentrations. Then, we applied the wavelet decomposition (WD) algorithm to obtain multi-scale decomposition results and a single-branch reconstruction of PM2.5 concentrations, to mitigate the prediction error produced by the time series data. In the next stage, we optimized the parameters of the KELM model under each reconstructed component. An improved version of the SSA is proposed to reach higher performance than the basic SSA optimizer and to avoid local stagnation problems. In this work, we propose new operators based on oppositional-based learning and simplex-based search to mitigate the core problems of the conventional SSA. In addition, we utilized a time-varying parameter instead of the main parameter of the SSA. To further boost the exploration trends of the SSA, we propose using random leaders to guide the swarm towards new regions of the feature space based on a conditional structure. After optimizing the model, the optimized model was used to predict PM2.5 concentrations, and different error metrics were used to evaluate the model's performance and accuracy. The proposed model is evaluated on an hourly database of six air pollutants and meteorological features collected from the Beijing Municipal Environmental Monitoring Center. The experimental results show that the proposed WD-OSMSSA-KELM model can predict the PM2.5 concentration with superior performance (R: 0.995, RMSE: 11.906, MdAE: 2.424, MAPE: 9.768, KGE: 0.963, R2: 0.990) compared to the WD-CatBoost, WD-LightGBM, WD-Xgboost, and WD-Ridge methods.
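
The error metrics named in the abstract (R, RMSE, MdAE, MAPE, KGE, R2) can be illustrated with a minimal Python sketch. The definitions below are the commonly used ones and are an assumption here, since the abstract does not state its formulas: R2 is taken as the coefficient of determination, MAPE is expressed in percent, and KGE follows the usual Kling-Gupta form built from correlation, variability ratio, and bias ratio. The function names and the sample values are hypothetical and are not data from the paper.

```python
import math
from statistics import mean, median, pstdev


def pearson_r(obs, sim):
    """Pearson correlation coefficient between observations and predictions."""
    mo, ms = mean(obs), mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def rmse(obs, sim):
    """Root mean squared error."""
    return math.sqrt(mean([(o - s) ** 2 for o, s in zip(obs, sim)]))


def mdae(obs, sim):
    """Median absolute error."""
    return median([abs(o - s) for o, s in zip(obs, sim)])


def mape(obs, sim):
    """Mean absolute percentage error in percent (assumes no zero observations)."""
    return 100.0 * mean([abs((o - s) / o) for o, s in zip(obs, sim)])


def r2(obs, sim):
    """Coefficient of determination (assumed meaning of 'R2')."""
    mo = mean(obs)
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mo) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def kge(obs, sim):
    """Kling-Gupta efficiency: 1 - distance of (r, alpha, beta) from (1, 1, 1)."""
    r = pearson_r(obs, sim)
    alpha = pstdev(sim) / pstdev(obs)  # variability ratio
    beta = mean(sim) / mean(obs)       # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical hourly PM2.5 values, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
scores = {
    "R": pearson_r(observed, predicted),
    "RMSE": rmse(observed, predicted),
    "MdAE": mdae(observed, predicted),
    "MAPE": mape(observed, predicted),
    "KGE": kge(observed, predicted),
    "R2": r2(observed, predicted),
}
```

R, KGE, and R2 equal 1 for a perfect prediction, while RMSE, MdAE, and MAPE equal 0, which is why the abstract reads values near 1 for the former group as evidence of good accuracy.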

Similar resources

Boruta - A System for Feature Selection

Machine learning methods are often used to classify objects described by hundreds of attributes; in many applications of this kind a great fraction of attributes may be totally irrelevant to the classification problem. Even more, usually one cannot decide a priori which attributes are relevant. In this paper we present an improved version of the algorithm for identification of the full set of t...

Feature Selection with the Boruta Package

This article describes an R package, Boruta, implementing a novel feature selection algorithm for finding all relevant variables. The algorithm is designed as a wrapper around a Random Forest classification algorithm. It iteratively removes the features which are proved by a statistical test to be less relevant than random probes. The Boruta package provides a convenient interface to the algorith...
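
The random-probe idea described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Boruta package's implementation: absolute Pearson correlation stands in for Random Forest importance, a simple majority count over rounds stands in for the statistical test, and the function names, the column names, and the synthetic data are all hypothetical.

```python
import math
import random


def abs_corr(xs, ys):
    """Absolute Pearson correlation, used here as a stand-in importance score."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / math.sqrt(vx * vy)


def shadow_select(columns, target, n_rounds=20, seed=0):
    """Keep features that beat the best shuffled 'shadow' copy in most rounds.

    columns: dict mapping feature name -> list of floats.
    target:  list of floats of the same length.
    """
    rng = random.Random(seed)
    real_scores = {name: abs_corr(col, target) for name, col in columns.items()}
    hits = {name: 0 for name in columns}
    for _ in range(n_rounds):
        # Build one shadow per feature by shuffling its values, which
        # destroys any relationship with the target.
        best_shadow = 0.0
        for col in columns.values():
            shadow = col[:]
            rng.shuffle(shadow)
            best_shadow = max(best_shadow, abs_corr(shadow, target))
        # A feature scores a hit when it outperforms every shadow.
        for name, score in real_scores.items():
            if score > best_shadow:
                hits[name] += 1
    # Majority vote replaces the statistical test of the real algorithm.
    return [name for name, h in hits.items() if h > n_rounds / 2]


# Synthetic data: one informative column and one pure-noise column.
data_rng = random.Random(1)
target = [data_rng.gauss(0, 1) for _ in range(200)]
columns = {
    "signal": [t + data_rng.gauss(0, 0.1) for t in target],
    "noise": [data_rng.gauss(0, 1) for _ in range(200)],
}
selected = shadow_select(columns, target)
```

Comparing each real feature against shuffled copies gives a data-driven relevance threshold, so no fixed importance cutoff has to be chosen in advance.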

Stepwise Feature Selection Using Multiple Kernel Learning

In this paper we propose a novel more flexible approach for the simultaneous feature selection and classification using Support Vector Machine and recent major advances of it, namely Multiple Kernel Learning. Using a quite simple kernel assembly scheme in the following paper we will indicate that feature selection and classification could be done in one step without applying computationally int...

A Hybrid Random Forests-boruta Feature Selection Algorithm for Biodegradibility Prediction

The a priori knowledge about biodegradability is adopted to save time and money for research and design of new products. Quantitative structure activity relationship (QSAR) models as a tool for biodegradability prediction of chemicals have been encouraged by environmental organizations. In the current work, a new algorithm has been proposed to investigate the importance of chemical descriptors ...

Multiple Indefinite Kernel Learning for Feature Selection

Multiple kernel learning for feature selection (MKLFS) utilizes kernels to explore complex properties of features and performs better in embedded methods. However, the kernels in MKL-FS are generally limited to be positive definite. In fact, indefinite kernels often emerge in actual applications and can achieve better empirical performance. But due to the non-convexity of indefinite kernels, ex...

Journal

Journal title: Mathematics

Year: 2022

ISSN: 2227-7390

DOI: https://doi.org/10.3390/math10193566