Author
Abstract
Road traffic crashes pose a serious public safety challenge, particularly due to fatal and serious injuries. Although machine learning (ML) has been widely used for crash severity prediction, many studies remain accuracy-oriented and insufficiently address class imbalance, decision thresholds, and probabilistic reliability. This study proposes a safety-oriented and explainable ML framework for predicting killed or seriously injured (KSI) crashes using nationwide United Kingdom traffic accident data from 2020–2024. Crash severity is reformulated as a binary classification task distinguishing slight injury crashes from KSI outcomes, aligning model objectives with road safety priorities. A Light Gradient Boosting Machine (LightGBM) model is developed with imbalance handling using SMOTE, safety-oriented decision threshold optimization, and probability calibration. Model performance is evaluated using ROC–AUC, precision–recall analysis, confusion matrices, the Brier score, and a utility-based evaluation metric, while interpretability is ensured through SHapley Additive exPlanations (SHAP). Results show that default threshold settings fail to adequately detect severe crashes. At an optimized threshold of 0.35, the model achieves a Recall(KSI) of 0.605 – representing a substantial 73% improvement compared to conventional configurations – while maintaining acceptable precision. In addition, probability calibration confirms reliable risk estimation (Brier score = 0.190), supporting risk-based interpretation. Comparative analysis demonstrates that the SMOTE-based model provides a more balanced and operationally effective trade-off than class-weighted learning. SHAP analysis identifies speed limit, road class, lighting conditions, and urban context as key variables associated with KSI risk. The findings highlight the importance of safety-oriented learning design and context-aware performance interpretation for effective, risk-based traffic safety management.
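The threshold and calibration checks named in the abstract can be illustrated with a minimal sketch. It assumes toy labels and predicted probabilities (not the study's data) and uses hypothetical helper names. It does not reproduce the author's LightGBM, SMOTE, or calibration pipeline. It only shows how lowering the decision threshold from the default 0.50 to 0.35 changes Recall(KSI) and precision, and how a Brier score is computed as the mean squared difference between predicted probability and observed outcome.

```python
# Illustrative only: toy data and helper names are assumptions, not from the paper.

def confusion_at_threshold(y_true, y_prob, threshold):
    """Count (tp, fp, fn, tn), labelling a crash as KSI (1) when prob >= threshold."""
    tp = fp = fn = tn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def recall_precision(y_true, y_prob, threshold):
    """Return (recall, precision) for the positive (KSI) class."""
    tp, fp, fn, _ = confusion_at_threshold(y_true, y_prob, threshold)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy example: 4 KSI crashes (1) and 6 slight-injury crashes (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.80, 0.45, 0.40, 0.20, 0.60, 0.30, 0.25, 0.10, 0.10, 0.05]

for threshold in (0.50, 0.35):
    recall, precision = recall_precision(y_true, y_prob, threshold)
    print(f"threshold={threshold:.2f} recall={recall:.2f} precision={precision:.2f}")

print(f"Brier score={brier_score(y_true, y_prob):.4f}")
```

In this toy case the lower threshold catches more KSI crashes. On real data, precision usually falls as the threshold is lowered, which is the trade-off the abstract describes as maintaining acceptable precision.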
Suggested Citation
Khanh Giang Le, 2026. "Safety-oriented and explainable machine learning for KSI crash risk prediction: Evidence from the United Kingdom," PLOS ONE, Public Library of Science, vol. 21(4), pages 1-21, April.
Handle: RePEc:plo:pone00:0347873
DOI: 10.1371/journal.pone.0347873