
Comparison of Selected Ensemble Supervised Learning Algorithms Used for Meteorological Normalisation of Particulate Matter (PM10)

Authors

  • Karolina Gora (Faculty of Geo-Data Science, Geodesy and Environmental Engineering, AGH University of Krakow, 30-059 Kraków, Poland)

  • Mateusz Rzeszutek (Faculty of Geo-Data Science, Geodesy and Environmental Engineering, AGH University of Krakow, 30-059 Kraków, Poland)

Abstract

Air pollution, particularly PM10 particulate matter, poses significant health risks related to respiratory and cardiovascular diseases as well as cancer. Accurate identification of PM10 reduction factors is therefore essential for developing effective sustainable development strategies. According to the current state of knowledge, machine learning methods are most frequently employed for this purpose due to their superior performance compared to classical statistical approaches. This study evaluated the performance of three machine learning algorithms, Decision Tree (CART), Random Forest, and Cubist Rule, in predicting PM10 concentrations and estimating long-term trends following meteorological normalisation. The research focused on Tarnów, Poland (2010–2022), with comprehensive consideration of meteorological variability. The results demonstrated superior accuracy for the Random Forest and Cubist models (R² ~0.88–0.89, RMSE ~14 μg/m³) compared to CART (RMSE 19.96 μg/m³). Air temperature and boundary layer height emerged as the most significant predictive variables across all algorithms. The Cubist algorithm proved particularly effective in detecting the impact of policy interventions, making it valuable for air quality trend analysis. While the study confirmed a statistically significant annual decrease in PM10 concentrations (0.83–1.03 μg/m³), pollution levels still exceeded both the updated EU air quality standards from 2024 (Directive (EU) 2024/2881), which will come into force in 2030, and the more stringent WHO guidelines from 2021.
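
The abstract does not spell out how meteorological normalisation or the reported error metrics are computed. The sketch below illustrates the general idea only, under stated assumptions: a fitted model is queried many times per time step with meteorological inputs resampled from the observed record, and the predictions are averaged so that weather-driven variability is smoothed out. The function names (`rmse`, `r_squared`, `normalise`, `toy_predict`), the toy linear model and its coefficients are hypothetical and are not taken from the paper; in the study the model would be a trained CART, Random Forest or Cubist model rather than a hand-written formula.

```python
import math
import random
from statistics import mean


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def normalise(predict, time_steps, met_rows, n_samples=200, seed=0):
    """Average predictions over randomly resampled meteorology.

    predict    -- callable(time_step, met_row) -> predicted concentration
    time_steps -- sequence of time features (here: years since start)
    met_rows   -- observed meteorological records to resample from
    """
    rng = random.Random(seed)
    normalised = []
    for t in time_steps:
        draws = [predict(t, rng.choice(met_rows)) for _ in range(n_samples)]
        normalised.append(mean(draws))
    return normalised


def toy_predict(year, met):
    """Hypothetical stand-in for a trained model (assumed coefficients)."""
    return 50.0 - 1.0 * year - 0.8 * met["temp"]


# Hypothetical meteorological record: air temperature in degrees Celsius.
met_record = [{"temp": t} for t in (-5, 0, 5, 10, 15, 20)]
years = list(range(13))  # 13 annual steps, e.g. 2010 to 2022

normalised_series = normalise(toy_predict, years, met_record)
```

Because every time step is evaluated against the same pool of weather conditions, the remaining change in `normalised_series` reflects the time feature rather than year-to-year weather, which is what makes a subsequent trend estimate interpretable.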

Suggested Citation

  • Karolina Gora & Mateusz Rzeszutek, 2025. "Comparison of Selected Ensemble Supervised Learning Algorithms Used for Meteorological Normalisation of Particulate Matter (PM10)," Sustainability, MDPI, vol. 17(12), pages 1-16, June.
  • Handle: RePEc:gam:jsusta:v:17:y:2025:i:12:p:5274-:d:1673993

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/17/12/5274/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/17/12/5274/
    Download Restriction: no

    References listed on IDEAS

    1. Matthew A. Cole & Robert J R Elliott & Bowen Liu, 2020. "The Impact of the Wuhan Covid-19 Lockdown on Air Pollution and Health: A Machine Learning and Augmented Synthetic Control Approach," Environmental & Resource Economics, Springer;European Association of Environmental and Resource Economists, vol. 76(4), pages 553-580, August.
    2. Wright, Marvin N. & Ziegler, Andreas, 2017. "ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 77(i01).
    3. Sandra Ceballos-Santos & Jaime González-Pardo & David C. Carslaw & Ana Santurtún & Miguel Santibáñez & Ignacio Fernández-Olmo, 2021. "Meteorological Normalisation Using Boosted Regression Trees to Estimate the Impact of COVID-19 Restrictions on Air Quality Levels," IJERPH, MDPI, vol. 18(24), pages 1-18, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Luis A Barboza & Shu-Wei Chou-Chen & Paola Vásquez & Yury E García & Juan G Calvo & Hugo G Hidalgo & Fabio Sanchez, 2023. "Assessing dengue fever risk in Costa Rica by using climate variables and machine learning techniques," PLOS Neglected Tropical Diseases, Public Library of Science, vol. 17(1), pages 1-13, January.
    2. repec:osf:osfxxx:s8ayp_v1 is not listed on IDEAS
    3. Augusto Cerqua & Roberta Di Stefano, 2022. "When did coronavirus arrive in Europe?," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 31(1), pages 181-195, March.
    4. Pekka Malo & Juha Eskelinen & Xun Zhou & Timo Kuosmanen, 2024. "Computing Synthetic Controls Using Bilevel Optimization," Computational Economics, Springer;Society for Computational Economics, vol. 64(2), pages 1113-1136, August.
    5. Bokelmann, Björn & Lessmann, Stefan, 2024. "Improving uplift model evaluation on randomized controlled trial data," European Journal of Operational Research, Elsevier, vol. 313(2), pages 691-707.
    6. Joel Podgorski & Oliver Kracht & Luis Araguas-Araguas & Stefan Terzer-Wassmuth & Jodie Miller & Ralf Straub & Rolf Kipfer & Michael Berg, 2024. "Groundwater vulnerability to pollution in Africa’s Sahel region," Nature Sustainability, Nature, vol. 7(5), pages 558-567, May.
    7. Nayiri Galestian Pour & Soudabeh Shemehsavar, 2024. "Learning from high dimensional data based on weighted feature importance in decision tree ensembles," Computational Statistics, Springer, vol. 39(1), pages 313-342, February.
    8. Bazyli Czyżewski & Jakub Staniszewski & Joanna Staniszewska & Marta Guth, 2025. "Does Increasing Agricultural Efficiency Contribute to Food Security—Trade‐Offs of Value Addition in Crop Production?," Sustainable Development, John Wiley & Sons, Ltd., vol. 33(S1), pages 939-970, November.
    9. David Dorn & Florian Schoner & Moritz Seebacher & Lisa Simon & Ludger Woessmann, 2024. "Multidimensional Skills on LinkedIn Profiles: Measuring Human Capital and the Gender Skill Gap," Papers 2409.18638, arXiv.org, revised May 2025.
    10. Albert Stuart Reece & Gary Kenneth Hulse, 2022. "European Epidemiological Patterns of Cannabis- and Substance-Related Congenital Neurological Anomalies: Geospatiotemporal and Causal Inferential Study," IJERPH, MDPI, vol. 20(1), pages 1-35, December.
    11. Chen, Jianbao & Shen, Jiamin & Ke, Nan, 2025. "Assessing the impact of new energy demonstration city policy on industrial carbon intensity using machine learning," Economic Analysis and Policy, Elsevier, vol. 87(C), pages 1690-1707.
    12. Van Belle, Jente & Guns, Tias & Verbeke, Wouter, 2021. "Using shared sell-through data to forecast wholesaler demand in multi-echelon supply chains," European Journal of Operational Research, Elsevier, vol. 288(2), pages 466-479.
    13. Philipp Bach & Victor Chernozhukov & Malte S. Kurz & Martin Spindler & Sven Klaassen, 2021. "DoubleML -- An Object-Oriented Implementation of Double Machine Learning in R," Papers 2103.09603, arXiv.org, revised Jun 2024.
    14. Marchetto, Elisa & Da Re, Daniele & Tordoni, Enrico & Bazzichetto, Manuele & Zannini, Piero & Celebrin, Simone & Chieffallo, Ludovico & Malavasi, Marco & Rocchini, Duccio, 2023. "Testing the effect of sample prevalence and sampling methods on probability- and favourability-based SDMs," Ecological Modelling, Elsevier, vol. 477(C).
    15. repec:plo:pone00:0233806 is not listed on IDEAS
    16. Nicholas Spyrison & Dianne Cook & Przemyslaw Biecek, 2025. "Exploring local explanations of nonlinear models using animated linear projections," Computational Statistics, Springer, vol. 40(2), pages 1071-1095, February.
    17. Michael Bucker & Gero Szepannek & Alicja Gosiewska & Przemyslaw Biecek, 2020. "Transparency, Auditability and eXplainability of Machine Learning Models in Credit Scoring," Papers 2009.13384, arXiv.org.
    18. Jian Lu & Raheel Ahmad & Thomas Nguyen & Jeffrey Cifello & Humza Hemani & Jiangyuan Li & Jinguo Chen & Siyi Li & Jing Wang & Achouak Achour & Joseph Chen & Meagan Colie & Ana Lustig & Christopher Dunn, 2022. "Heterogeneity and transcriptome changes of human CD8+ T cells across nine decades of life," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    19. Timo Schulte & Tillmann Wurz & Oliver Groene & Sabine Bohnet-Joschko, 2023. "Big Data Analytics to Reduce Preventable Hospitalizations—Using Real-World Data to Predict Ambulatory Care-Sensitive Conditions," IJERPH, MDPI, vol. 20(6), pages 1-16, March.
    20. Brandon Hayes & Timothée Vergne & Nicolas Rose & Cristian Mortasivu & Mathieu Andraud, 2026. "A multi-host mechanistic model of African swine fever emergence and control in Romania," Nature Communications, Nature, vol. 17(1), pages 1-10, December.
    21. Bennett, Donyetta & Mekelburg, Erik & Strauss, Jack & Williams, T.H., 2024. "Unlocking the black box of sentiment and cryptocurrency: What, which, why, when and how?," Global Finance Journal, Elsevier, vol. 60(C).
    22. Sine Zambach & Jens Ulrik Hansen, 2023. "Student and teacher performance during COVID-19 lockdown: An investigation of associated features and complex interactions using multiple data sources," PLOS ONE, Public Library of Science, vol. 18(10), pages 1-29, October.
