
Comparative Analysis of Parametric and Non-Parametric Data-Driven Models to Predict Road Crash Severity among Elderly Drivers Using Synthetic Resampling Techniques

Author

Listed:
  • Mubarak Alrumaidhi

    (Center for Sustainable Mobility, Virginia Tech Transportation Institute, Blacksburg, VA 24061, USA
    Civil Engineering Department, College of Technological Studies, Public Authority for Applied Education and Training, Shuwaikh 70654, Kuwait)

  • Mohamed M. G. Farag

    (Center for Sustainable Mobility, Virginia Tech Transportation Institute, Blacksburg, VA 24061, USA
    College of Computing and Information Technology, Arab Academy for Science, Technology, and Maritime Transport, Alexandria 1029, Egypt)

  • Hesham A. Rakha

    (Center for Sustainable Mobility, Virginia Tech Transportation Institute, Blacksburg, VA 24061, USA
    Charles E. Via, Jr. Department of Civil and Environmental Engineering, Virginia Tech, Blacksburg, VA 24061, USA)

Abstract

As the global elderly population continues to rise, the risk of severe crashes among elderly drivers has become a pressing concern. This study presents a comprehensive examination of crash severity among this demographic, employing machine learning models and data gathered from Virginia, United States of America, between 2014 and 2021. The analysis integrates parametric models, namely logistic regression and linear discriminant analysis (LDA), as well as non-parametric models like random forest (RF) and extreme gradient boosting (XGBoost). Central to this study is the application of resampling techniques, specifically, random over-sampling examples (ROSE) and the synthetic minority over-sampling technique (SMOTE), to address the dataset’s inherent imbalance and enhance the models’ predictive performance. Our findings reveal that the inclusion of these resampling techniques significantly improves the predictive power of parametric models, notably increasing the true positive rate for severe crash prediction from 6% to 60% and boosting the geometric mean from 25% to 69% in logistic regression. Likewise, employing SMOTE resulted in a notable improvement in the non-parametric models’ performance, leading to a true positive rate increase from 8% to 36% in XGBoost. Moreover, the study established the superiority of parametric models over non-parametric counterparts when balanced resampling techniques are utilized. Beyond predictive modeling, the study delves into the effects of various contributing factors on crash severity, enhancing the understanding of how these factors influence elderly road safety. Ultimately, these findings underscore the immense potential of machine learning models in analyzing complex crash data, pinpointing factors that heighten crash severity, and informing targeted interventions to mitigate the risks of elderly driving.
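
The resampling idea and the two reported metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `smote_like` and `tpr_and_gmean`, the toy data, and the choice of `k` are hypothetical. It also assumes the common binary definition of the geometric mean, the square root of the true positive rate times the true negative rate, which the abstract does not spell out. The first function follows the core SMOTE step of interpolating between a minority-class point and one of its nearest minority-class neighbours.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Return n_new synthetic minority points (SMOTE-style interpolation).

    Each synthetic point lies on the segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    Assumes `minority` holds at least two points of equal dimension.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Candidate neighbours: every other minority point, nearest first.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def tpr_and_gmean(y_true, y_pred, positive=1):
    """Return (true positive rate, geometric mean of TPR and TNR).

    Assumes binary labels and that both classes occur in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    return tpr, math.sqrt(tpr * tnr)


# Toy minority class (hypothetical "severe crash" feature vectors).
severe = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote_like(severe, n_new=4)

# A classifier that finds only 1 of 4 severe crashes but no false alarms.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0]
tpr, gmean = tpr_and_gmean(y_true, y_pred)
```

In this toy case the true positive rate is 0.25 and the geometric mean is 0.5, which shows why a model can look accurate overall on imbalanced data while still missing most of the minority class. In practice a maintained library implementation of SMOTE would normally be preferred over a hand-written one.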

Suggested Citation

  • Mubarak Alrumaidhi & Mohamed M. G. Farag & Hesham A. Rakha, 2023. "Comparative Analysis of Parametric and Non-Parametric Data-Driven Models to Predict Road Crash Severity among Elderly Drivers Using Synthetic Resampling Techniques," Sustainability, MDPI, vol. 15(13), pages 1-30, June.
  • Handle: RePEc:gam:jsusta:v:15:y:2023:i:13:p:9878-:d:1176070

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/15/13/9878/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/15/13/9878/
    Download Restriction: no

    Citations

    Cited by:

    1. Abdulaziz H Alshehri & Fayez Alanazi & Ahmed M Yosri & Muhammad Yasir, 2024. "Comparing fatal crash risk factors by age and crash type by using machine learning techniques," PLOS ONE, Public Library of Science, vol. 19(5), pages 1-22, May.
    2. Jamal Almatawah & Mubarak Alrumaidhi & Hamad Matar & Abdulsalam Altemeemi & Jamal Alhubail, 2025. "An Interpretable Machine Learning Framework for Urban Traffic Noise Prediction in Kuwait: A Data-Driven Approach to Environmental Management," Sustainability, MDPI, vol. 17(19), pages 1-18, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xiangning Dong & Xuhao Zhu & Minghua Hu & Jie Bao, 2023. "A Methodology for Predicting Ground Delay Program Incidence through Machine Learning," Sustainability, MDPI, vol. 15(8), pages 1-19, April.
    2. Ulaa AlHaddad & Abdullah Basuhail & Maher Khemakhem & Fathy Elbouraey Eassa & Kamal Jambi, 2023. "Towards Sustainable Energy Grids: A Machine Learning-Based Ensemble Methods Approach for Outages Estimation in Extreme Weather Events," Sustainability, MDPI, vol. 15(16), pages 1-19, August.
    3. Jamal Almatawah & Mubarak Alrumaidhi & Hamad Matar & Abdulsalam Altemeemi & Jamal Alhubail, 2025. "An Interpretable Machine Learning Framework for Urban Traffic Noise Prediction in Kuwait: A Data-Driven Approach to Environmental Management," Sustainability, MDPI, vol. 17(19), pages 1-18, October.
    4. Michael Lechner & Gabriel Okasa, 2025. "Random Forest estimation of the ordered choice model," Empirical Economics, Springer, vol. 68(1), pages 1-106, January.
    5. Dong-Her Shih & Feng-I Chung & Ting-Wei Wu & Shuo-Yu Huang & Ming-Hung Shih, 2024. "Advanced Trans-EEGNet Deep Learning Model for Hypoxic-Ischemic Encephalopathy Severity Grading," Mathematics, MDPI, vol. 12(24), pages 1-27, December.
    6. Kamil Prokop & Andrzej Bień & Szymon Barczentewicz, 2023. "Compression Techniques for Real-Time Control and Non-Time-Critical Big Data in Smart Grids: A Review," Energies, MDPI, vol. 16(24), pages 1-26, December.
    7. Ekram Al Mahdouri & Said Al-Abri & Hassan Yousef & Ibrahim Al-Naimi & Hussein Obeid, 2025. "Physics-Informed Neural Networks in Grid-Connected Inverters: A Review," Energies, MDPI, vol. 18(20), pages 1-19, October.
    8. Rahman, Md Jahidur & Zhu, Hongtao, 2024. "Detecting accounting fraud in family firms: Evidence from machine learning approaches," Advances in accounting, Elsevier, vol. 64(C).
    9. Sung-Mook Oh & Jin Park & Jinsun Yang & Young-Gyun Oh & Kyung-Woo Yi, 2023. "Smart classification method to detect irregular nozzle spray patterns inside carbon black reactor using ensemble transfer learning," Journal of Intelligent Manufacturing, Springer, vol. 34(6), pages 2729-2745, August.
    10. Piotr Szagała & Piotr Olszewski & Witold Czajewski & Paweł Dąbkowski, 2021. "Active Signage of Pedestrian Crossings as a Tool in Road Safety Management," Sustainability, MDPI, vol. 13(16), pages 1-13, August.
    11. Lin, Fengming & Fang, Shu-Cherng & Fang, Xiaolei & Gao, Zheming & Luo, Jian, 2024. "A distributionally robust chance-constrained kernel-free quadratic surface support vector machine," European Journal of Operational Research, Elsevier, vol. 316(1), pages 46-60.
    12. Fayaz Hassan & Zafi Sherhan Syed & Aftab Ahmed Memon & Saad Said Alqahtany & Nadeem Ahmed & Mana Saleh Al Reshan & Yousef Asiri & Asadullah Shaikh, 2025. "A hybrid approach for intrusion detection in vehicular networks using feature selection and dimensionality reduction with optimized deep learning," PLOS ONE, Public Library of Science, vol. 20(2), pages 1-18, February.
    13. Janis Ivanovs & Andreas Haberl & Raitis Melniks, 2024. "Modeling Geospatial Distribution of Peat Layer Thickness Using Machine Learning and Aerial Laser Scanning Data," Land, MDPI, vol. 13(4), pages 1-14, April.
    14. Yang Hui & Xuesong Mei & Gedong Jiang & Fei Zhao & Ziwei Ma & Tao Tao, 2022. "Assembly quality evaluation for linear axis of machine tool using data-driven modeling approach," Journal of Intelligent Manufacturing, Springer, vol. 33(3), pages 753-769, March.
    15. Xinchun Zhu & Yang Wu & Xu Zhao & Yunchen Yang & Shuangquan Liu & Luyi Shi & Yelong Wu, 2024. "Overview of Wind and Photovoltaic Data Stream Classification and Data Drift Issues," Energies, MDPI, vol. 17(17), pages 1-24, September.
    16. Yoon, Junho, 2025. "Prediction of high-risk areas using the interpretable machine learning: Based on each determinant for the severity of pedestrian crashes," Journal of Transport Geography, Elsevier, vol. 126(C).
    17. Lei Yang & Mahdi Aghaabbasi & Mujahid Ali & Amin Jan & Belgacem Bouallegue & Muhammad Faisal Javed & Nermin M. Salem, 2022. "Comparative Analysis of the Optimized KNN, SVM, and Ensemble DT Models Using Bayesian Optimization for Predicting Pedestrian Fatalities: An Advance towards Realizing the Sustainable Safety of Pedestri," Sustainability, MDPI, vol. 14(17), pages 1-18, August.
    18. Juan Liu & Sha Mi, 2023. "American literature news narration based on computer web technology," PLOS ONE, Public Library of Science, vol. 18(10), pages 1-17, October.
    19. Weijia (Vivian) Li & Kara M. Kockelman, 2022. "How does machine learning compare to conventional econometrics for transport data sets? A test of ML versus MLE," Growth and Change, Wiley Blackwell, vol. 53(1), pages 342-376, March.
    20. Chen, Tianyi & Wang, Hua & Cai, Yutong & Liang, Maohan & Meng, Qiang, 2025. "Exploring key factors for long-term vessel incident risk prediction," Reliability Engineering and System Safety, Elsevier, vol. 253(C).
