A Wavelet PM2.5 Prediction System Using Optimized Kernel Extreme Learning with Boruta-XGBoost Feature Selection

Author

Listed:
  • Ali Asghar Heidari

    (School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran 1439957131, Iran)

  • Mehdi Akhoondzadeh

    (Photogrammetry and Remote Sensing Department, School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, North Amirabad Ave., Tehran 1439957131, Iran)

  • Huiling Chen

    (Department of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou 325035, China)

Abstract

The concentration of fine particulate matter (PM2.5) is a vital source of information and an essential indicator for measuring and studying the concentrations of other air pollutants. Accurate PM2.5 prediction models are crucial because of their social impact and cross-field applications in geospatial engineering. To further improve the accuracy of PM2.5 prediction, this paper proposes a new wavelet PM2.5 prediction system (called the WD-OSMSSA-KELM model) based on a new, improved variant of the salp swarm algorithm (OSMSSA), the kernel extreme learning machine (KELM), wavelet decomposition (WD), and Boruta-XGBoost (B-XGB) feature selection. First, we applied B-XGB feature selection to identify the best features for predicting hourly PM2.5 concentrations. Then, we applied the WD algorithm to obtain a multi-scale decomposition and single-branch reconstruction of the PM2.5 concentrations, which mitigates the prediction error produced by time series data. In the next stage, we optimized the parameters of the KELM model for each reconstructed component. The improved SSA is proposed to outperform the basic SSA optimizer and to avoid local stagnation. It introduces new operators based on opposition-based learning and simplex-based search to mitigate the core problems of the conventional SSA. In addition, we used a time-varying parameter in place of the main parameter of the SSA. To further strengthen the exploration behavior of the SSA, we propose using random leaders to guide the swarm towards new regions of the feature space based on a conditional structure. Finally, the optimized model was used to predict PM2.5 concentrations, and several error metrics were applied to evaluate its performance and accuracy.
The proposed model was evaluated on an hourly database of six air pollutants and six meteorological features collected from the Beijing Municipal Environmental Monitoring Center. The experimental results show that the proposed WD-OSMSSA-KELM model predicts the PM2.5 concentration with superior performance (R: 0.995, RMSE: 11.906, MdAE: 2.424, MAPE: 9.768, KGE: 0.963, R²: 0.990) compared with the WD-CatBoost, WD-LightGBM, WD-XGBoost, and WD-Ridge methods.
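
The six scores reported in the abstract can be illustrated with a minimal sketch in plain Python. The abstract does not give the paper's exact formulas, so the definitions below are the common textbook ones and are an assumption: MAPE as a percentage, KGE in the usual correlation/variability/bias form, and R² as 1 - SS_res/SS_tot. The function names `pearson_r` and `evaluate` are hypothetical and not taken from the paper.

```python
import math
import statistics


def pearson_r(obs, pred):
    """Pearson correlation between observed and predicted values."""
    mean_o, mean_p = statistics.fmean(obs), statistics.fmean(pred)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def evaluate(obs, pred):
    """Return the six scores as a dict.

    Assumes obs and pred have equal length, neither is constant,
    and obs contains no zeros (MAPE divides by each observation).
    """
    errors = [o - p for o, p in zip(obs, pred)]
    abs_errors = [abs(e) for e in errors]
    mean_o, mean_p = statistics.fmean(obs), statistics.fmean(pred)

    r = pearson_r(obs, pred)
    alpha = statistics.pstdev(pred) / statistics.pstdev(obs)  # variability ratio
    beta = mean_p / mean_o                                    # bias ratio

    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_o) ** 2 for o in obs)
    pct_errors = [abs(e / o) for e, o in zip(errors, obs)]

    return {
        "R": r,
        "RMSE": math.sqrt(ss_res / len(errors)),
        "MdAE": statistics.median(abs_errors),
        "MAPE": 100.0 * statistics.fmean(pct_errors),
        "KGE": 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2),
        "R2": 1.0 - ss_res / ss_tot,
    }


# Toy data for illustration only; these are not values from the paper.
scores = evaluate([10.0, 20.0, 30.0, 40.0], [12.0, 22.0, 32.0, 42.0])
```

Some studies report R² as the squared Pearson correlation instead of 1 - SS_res/SS_tot. The abstract does not say which definition was used, so the two may differ slightly from the sketch above.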

Suggested Citation

  • Ali Asghar Heidari & Mehdi Akhoondzadeh & Huiling Chen, 2022. "A Wavelet PM2.5 Prediction System Using Optimized Kernel Extreme Learning with Boruta-XGBoost Feature Selection," Mathematics, MDPI, vol. 10(19), pages 1-35, September.
  • Handle: RePEc:gam:jmathe:v:10:y:2022:i:19:p:3566-:d:929588

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/10/19/3566/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/10/19/3566/
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yamashiro, Hirochika & Nonaka, Hirofumi, 2021. "Estimation of processing time using machine learning and real factory data for optimization of parallel machine scheduling problem," Operations Research Perspectives, Elsevier, vol. 8(C).
    2. Yanzhao Wang & Jianfei Cao, 2023. "Examining the Effects of Socioeconomic Development on Fine Particulate Matter (PM2.5) in China’s Cities Based on Spatial Autocorrelation Analysis and MGWR Model," IJERPH, MDPI, vol. 20(4), pages 1-23, February.
    3. Tong, Jianfeng & Liu, Zhenxing & Zhang, Yong & Zheng, Xiujuan & Jin, Junyang, 2023. "Improved multi-gate mixture-of-experts framework for multi-step prediction of gas load," Energy, Elsevier, vol. 282(C).
    4. Asma Shaheen & Javed Iqbal, 2018. "Spatial Distribution and Mobility Assessment of Carcinogenic Heavy Metals in Soil Profiles Using Geostatistics and Random Forest, Boruta Algorithm," Sustainability, MDPI, vol. 10(3), pages 1-20, March.
    5. Gennadiy Stroykov & Alexey Y. Cherepovitsyn & Elizaveta A. Iamshchikova, 2020. "Powering Multiple Gas Condensate Wells in Russia’s Arctic: Power Supply Systems Based on Renewable Energy Sources," Resources, MDPI, vol. 9(11), pages 1-15, November.
    6. Ramón Ferri-García & María del Mar Rueda, 2022. "Variable selection in Propensity Score Adjustment to mitigate selection bias in online surveys," Statistical Papers, Springer, vol. 63(6), pages 1829-1881, December.
    7. Ghosh, Indranil & Chaudhuri, Tamal Datta & Alfaro-Cortés, Esteban & Gámez, Matías & García, Noelia, 2022. "A hybrid approach to forecasting futures prices with simultaneous consideration of optimality in ensemble feature selection and advanced artificial intelligence," Technological Forecasting and Social Change, Elsevier, vol. 181(C).
    8. Nan Jia & Yinshuai Li & Ruishan Chen & Hongbo Yang, 2023. "A Review of Global PM 2.5 Exposure Research Trends from 1992 to 2022," Sustainability, MDPI, vol. 15(13), pages 1-15, July.
    9. Ook Lee & Hanseon Joo & Hayoung Choi & Minjong Cheon, 2022. "Proposing an Integrated Approach to Analyzing ESG Data via Machine Learning and Deep Learning Algorithms," Sustainability, MDPI, vol. 14(14), pages 1-14, July.
    10. Manuel J. García Rodríguez & Vicente Rodríguez Montequín & Francisco Ortega Fernández & Joaquín M. Villanueva Balsera, 2019. "Public Procurement Announcements in Spain: Regulations, Data Analysis, and Award Price Estimator Using Machine Learning," Complexity, Hindawi, vol. 2019, pages 1-20, November.
    11. Sangjin Kim & Jong-Min Kim, 2019. "Two-Stage Classification with SIS Using a New Filter Ranking Method in High Throughput Data," Mathematics, MDPI, vol. 7(6), pages 1-16, May.
    12. Arjan S. Gosal & Janine A. McMahon & Katharine M. Bowgen & Catherine H. Hoppe & Guy Ziv, 2021. "Identifying and Mapping Groups of Protected Area Visitors by Environmental Awareness," Land, MDPI, vol. 10(6), pages 1-14, May.
    13. Ahmed Ginidi & Sherif M. Ghoneim & Abdallah Elsayed & Ragab El-Sehiemy & Abdullah Shaheen & Attia El-Fergany, 2021. "Gorilla Troops Optimizer for Electrically Based Single and Double-Diode Models of Solar Photovoltaic Systems," Sustainability, MDPI, vol. 13(16), pages 1-28, August.
    14. Zhao-Yue Chen & Hervé Petetin & Raúl Fernando Méndez Turrubiates & Hicham Achebak & Carlos Pérez García-Pando & Joan Ballester, 2024. "Population exposure to multiple air pollutants and its compound episodes in Europe," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    15. Bram Janssens & Matthias Bogaert & Mathijs Maton, 2023. "Predicting the next Pogačar: a data analytical approach to detect young professional cycling talents," Annals of Operations Research, Springer, vol. 325(1), pages 557-588, June.
    16. Cooray, Upul & Watt, Richard G. & Tsakos, Georgios & Heilmann, Anja & Hariyama, Masanori & Yamamoto, Takafumi & Kuruppuarachchige, Isuruni & Kondo, Katsunori & Osaka, Ken & Aida, Jun, 2021. "Importance of socioeconomic factors in predicting tooth loss among older adults in Japan: Evidence from a machine learning analysis," Social Science & Medicine, Elsevier, vol. 291(C).
    17. Simon Besnard & Nuno Carvalhais & M Altaf Arain & Andrew Black & Benjamin Brede & Nina Buchmann & Jiquan Chen & Jan G P W Clevers & Loïc P Dutrieux & Fabian Gans & Martin Herold & Martin Jung & Yoshik, 2019. "Memory effects of climate and vegetation affecting net ecosystem CO2 fluxes in global forests," PLOS ONE, Public Library of Science, vol. 14(2), pages 1-22, February.
    18. Francesco Sartor & Jonathan P. Moore & Hans-Peter Kubis, 2021. "Plasma Interleukin-10 and Cholesterol Levels May Inform about Interdependences between Fitness and Fatness in Healthy Individuals," IJERPH, MDPI, vol. 18(4), pages 1-19, February.
    19. Mohamed Abdel-Basset & Reda Mohamed & Ripon K. Chakrabortty & Michael J. Ryan & Attia El-Fergany, 2021. "An Improved Artificial Jellyfish Search Optimizer for Parameter Identification of Photovoltaic Models," Energies, MDPI, vol. 14(7), pages 1-33, March.
    20. Wei Xue & Qingming Zhan & Qi Zhang & Zhonghua Wu, 2019. "Spatiotemporal Variations of Particulate and Gaseous Pollutants and Their Relations to Meteorological Parameters: The Case of Xiangyang, China," IJERPH, MDPI, vol. 17(1), pages 1-23, December.
