Printed from https://ideas.repec.org/a/gam/jsusta/v15y2023i14p11408-d1200319.html

Short-Term PM 2.5 Concentration Changes Prediction: A Comparison of Meteorological and Historical Data

Author

Listed:
  • Junfeng Kang

    (School of Civil and Surveying & Mapping Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China)

  • Xinyi Zou

    (School of Civil and Surveying & Mapping Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China)

  • Jianlin Tan

    (School of Civil and Surveying & Mapping Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China)

  • Jun Li

    (Guangdong Science & Technology Infrastructure Center, Guangzhou 510033, China)

  • Hamed Karimian

    (School of Civil and Surveying & Mapping Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China;
    School of Marine Technology and Geomatics, Jiangsu Ocean University, Lianyungang 222005, China)

Abstract

Machine learning is widely used to predict PM 2.5 concentrations. This study compares the accuracy of machine learning models in predicting short-term PM 2.5 concentration changes and seeks a universal, robust model for both hourly and daily time scales. Five commonly used machine learning models were constructed, along with a stacking model that uses Multivariable Linear Regression (MLR) as the meta-learner and Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM) as the base learners. Two preprocessed datasets, one containing meteorological data only and one combining historical PM 2.5 concentrations with meteorological data, were used to evaluate the models' accuracy and stability at hourly and daily time scales, using the coefficient of determination (R²), Root-Mean-Square Error (RMSE), and Mean Absolute Error (MAE). The results show that historical PM 2.5 concentration data are crucial to the prediction accuracy of the machine learning models. On the meteorological dataset, the stacking model, XGBoost, and RF performed better than the remaining models for hourly prediction, and the stacking model, XGBoost, and LightGBM performed better for daily prediction. On the dataset combining historical PM 2.5 concentrations with meteorological data, the stacking model, LightGBM, and XGBoost performed better at both the hourly and daily scales. Overall, the stacking model outperformed the individual models; XGBoost was the best individual model for predicting PM 2.5 concentrations from meteorological data alone, and LightGBM was the best individual model when historical PM 2.5 data were combined with meteorological data.
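
The three evaluation metrics named in the abstract can be made concrete with a short sketch. The Python below computes R², RMSE, and MAE from their standard definitions using only the standard library. The function names, the sample observations, and the two sets of predictions are illustrative assumptions; none of them come from the study or its data.

```python
import math
from typing import Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root-mean-square error, in the same units as the target."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error, in the same units as the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up PM 2.5 observations and two hypothetical sets of predictions
# (assumptions for illustration only, not results from the paper).
observed = [35.0, 40.0, 52.0, 48.0, 30.0]
pred_a = [33.0, 42.0, 50.0, 47.0, 33.0]  # tracks the observations closely
pred_b = [40.0, 40.0, 40.0, 40.0, 40.0]  # constant guess

for name, pred in (("pred_a", pred_a), ("pred_b", pred_b)):
    print(
        name,
        "R2=%.3f" % r2_score(observed, pred),
        "RMSE=%.3f" % rmse(observed, pred),
        "MAE=%.3f" % mae(observed, pred),
    )
```

A higher R² together with lower RMSE and MAE indicates a better fit. RMSE penalizes large errors more heavily than MAE, which is one reason the two are commonly reported side by side.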

Suggested Citation

  • Junfeng Kang & Xinyi Zou & Jianlin Tan & Jun Li & Hamed Karimian, 2023. "Short-Term PM 2.5 Concentration Changes Prediction: A Comparison of Meteorological and Historical Data," Sustainability, MDPI, vol. 15(14), pages 1-24, July.
  • Handle: RePEc:gam:jsusta:v:15:y:2023:i:14:p:11408-:d:1200319

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/15/14/11408/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/15/14/11408/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hongbin Dai & Guangqiu Huang & Huibin Zeng & Fan Yang, 2021. "PM 2.5 Concentration Prediction Based on Spatiotemporal Feature Selection Using XGBoost-MSCNN-GA-LSTM," Sustainability, MDPI, vol. 13(21), pages 1-24, November.
    2. Yamashiro, Hirochika & Nonaka, Hirofumi, 2021. "Estimation of processing time using machine learning and real factory data for optimization of parallel machine scheduling problem," Operations Research Perspectives, Elsevier, vol. 8(C).
    3. Aurelio F. Bariviera & Ignasi Merediz‐Solà, 2021. "Where Do We Stand In Cryptocurrencies Economic Research? A Survey Based On Hybrid Analysis," Journal of Economic Surveys, Wiley Blackwell, vol. 35(2), pages 377-407, April.
    4. Li, Zhengtao & Hu, Bin, 2018. "Perceived health risk, environmental knowledge, and contingent valuation for improving air quality: New evidence from the Jinchuan mining area in China," Economics & Human Biology, Elsevier, vol. 31(C), pages 54-68.
    5. Alireza Rezazadeh & Yasamin Jafarian & Ali Kord, 2022. "Explainable Ensemble Machine Learning for Breast Cancer Diagnosis Based on Ultrasound Image Texture Features," Forecasting, MDPI, vol. 4(1), pages 1-13, February.
    6. Feng, Qianqian & Sun, Xiaolei & Hao, Jun & Li, Jianping, 2021. "Predictability dynamics of multifactor-influenced installed capacity: A perspective of country clustering," Energy, Elsevier, vol. 214(C).
    7. Mohamed Massaoudi & Ines Chihi & Lilia Sidhom & Mohamed Trabelsi & Shady S. Refaat & Fakhreddine S. Oueslati, 2021. "Enhanced Random Forest Model for Robust Short-Term Photovoltaic Power Forecasting Using Weather Measurements," Energies, MDPI, vol. 14(13), pages 1-20, July.
    8. Sibtain, Muhammad & Li, Xianshan & Saleem, Snoober & Ain, Qurat-ul- & Shi, Qiang & Li, Fei & Saeed, Muhammad & Majeed, Fatima & Shah, Syed Shoaib Ahmed & Saeed, Muhammad Hammad, 2022. "Multifaceted irradiance prediction by exploiting hybrid decomposition-entropy-Spatiotemporal attention based Sequence2Sequence models," Renewable Energy, Elsevier, vol. 196(C), pages 648-682.
    9. Goodell, John W. & Ben Jabeur, Sami & Saâdaoui, Foued & Nasir, Muhammad Ali, 2023. "Explainable artificial intelligence modeling to forecast bitcoin prices," International Review of Financial Analysis, Elsevier, vol. 88(C).
    10. Vaia I. Kontopoulou & Athanasios D. Panagopoulos & Ioannis Kakkos & George K. Matsopoulos, 2023. "A Review of ARIMA vs. Machine Learning Approaches for Time Series Forecasting in Data Driven Networks," Future Internet, MDPI, vol. 15(8), pages 1-31, July.
    11. Wang, Xiaoyang & Sun, Yunlin & Luo, Duo & Peng, Jinqing, 2022. "Comparative study of machine learning approaches for predicting short-term photovoltaic power output based on weather type classification," Energy, Elsevier, vol. 240(C).
    12. Zhou, Yi & Zhou, Nanrun & Gong, Lihua & Jiang, Minlin, 2020. "Prediction of photovoltaic power output based on similar day analysis, genetic algorithm and extreme learning machine," Energy, Elsevier, vol. 204(C).
    13. Hakan Pabuccu & Adrian Barbu, 2023. "Feature Selection with Annealing for Forecasting Financial Time Series," Papers 2303.02223, arXiv.org, revised Feb 2024.
    14. Alipour, Mohammadali & Aghaei, Jamshid & Norouzi, Mohammadali & Niknam, Taher & Hashemi, Sattar & Lehtonen, Matti, 2020. "A novel electrical net-load forecasting model based on deep neural networks and wavelet transform integration," Energy, Elsevier, vol. 205(C).
    15. Cheng, Lilin & Zang, Haixiang & Wei, Zhinong & Zhang, Fengchun & Sun, Guoqiang, 2022. "Evaluation of opaque deep-learning solar power forecast models towards power-grid applications," Renewable Energy, Elsevier, vol. 198(C), pages 960-972.
    16. Prof. Reepu & Prof.Bijesh Dhyani & Ms. Ayushi & Dr. Sudhi Sharma & Dr. Manish Kumar, 2022. "Predictive Modelling Of Select Cryptocurrencies And Identifying The Best Suitable Model - With Reference To Arima And Anns," Annals - Economy Series, Constantin Brancusi University, Faculty of Economics, vol. 6, pages 11-19, December.
    17. Luis Sarmiento, 2020. "Waiting for My Sentence: Air Pollution and the Productivity of Court Rulings," Discussion Papers of DIW Berlin 1878, DIW Berlin, German Institute for Economic Research.
    18. Xu Gong & Keqin Guan & Qiyang Chen, 2022. "The role of textual analysis in oil futures price forecasting based on machine learning approach," Journal of Futures Markets, John Wiley & Sons, Ltd., vol. 42(10), pages 1987-2017, October.
    19. Luis Sarmiento, 2022. "Air pollution and the productivity of high‐skill labor: evidence from court hearings," Scandinavian Journal of Economics, Wiley Blackwell, vol. 124(1), pages 301-332, January.
    20. Andi A. H. Lateko & Hong-Tzer Yang & Chao-Ming Huang, 2022. "Short-Term PV Power Forecasting Using a Regression-Based Ensemble Method," Energies, MDPI, vol. 15(11), pages 1-21, June.

