Author
Listed:
- Yingjie Kuang
- Jun Zhang
- Zhen An
- Chunxu Yang
- Wenxu Guo
- Xiaomin Liu
- Yue Zhang
Abstract
Background: Isolated distal deep vein thrombosis (IDDVT) is common, yet tools for predicting poor recanalization remain limited. We aimed to develop and compare machine learning models for predicting poor recanalization in patients with IDDVT and to identify the most informative predictors.

Methods: A total of 1600 patients with IDDVT were retrospectively enrolled. The dataset was randomly divided into a development set (n = 1280) and an independent test set (n = 320) using stratified sampling. Six predictive models were developed and compared: logistic regression (LR), support vector machine (SVM), random forest (RF), multilayer perceptron (MLP), extreme gradient boosting (XGBoost), and a Voting Ensemble. Model training and hyperparameter tuning were performed in the development set using five-fold stratified cross-validation, and optimal classification thresholds were determined using the Youden index. Model performance was evaluated by discrimination, calibration, and classification metrics, with 95% confidence intervals estimated by bootstrap resampling (10,000 iterations). SHAP analysis was applied to interpret the final model.

Results: In the independent test set, all models showed acceptable to strong discrimination, with AUC values ranging from 0.808 to 0.908. XGBoost achieved the best overall performance, with an optimal threshold of 0.183, an AUC of 0.908 (95% CI, 0.855–0.952), a Brier score of 0.077 (95% CI, 0.058–0.096), an accuracy of 0.900 (95% CI, 0.866–0.931), a precision of 0.650 (95% CI, 0.529–0.767), a recall of 0.803 (95% CI, 0.686–0.906), an F1-score of 0.717 (95% CI, 0.615–0.806), and a specificity of 0.918 (95% CI, 0.884–0.950). The calibration intercept and slope of the XGBoost model were 0.149 (95% CI, −0.192 to 0.454) and 1.410 (95% CI, 1.098–1.809), respectively, indicating acceptable overall calibration. SHAP analysis identified D-dimer rate, provoking-factor-related variables, anticoagulant use, and age group as the most influential predictors.

Conclusion: Among six candidate models, XGBoost showed the best overall performance for predicting poor recanalization in patients with IDDVT. This study establishes an interpretable machine learning-based prediction framework focused specifically on poor recanalization in IDDVT and highlights the contribution of dynamic laboratory information, particularly D-dimer rate. The model may support early risk stratification and individualized follow-up planning, but external validation is required before routine clinical implementation.
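The abstract reports that classification thresholds were chosen with the Youden index and that probabilistic accuracy was summarized with the Brier score. As a minimal illustration of those two quantities, the sketch below computes them in plain Python on invented toy data. It is not the authors' code: the function names `youden_threshold` and `brier_score`, the candidate-threshold scheme (every distinct predicted probability), and the eight-patient example are all assumptions made for illustration.

```python
from typing import List, Tuple


def youden_threshold(y_true: List[int], y_prob: List[float]) -> Tuple[float, float]:
    """Return (threshold, J) maximizing Youden's J = sensitivity + specificity - 1.

    A case is classified positive when its predicted probability is >= threshold.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    best_threshold, best_j = 0.0, -1.0
    # Assumption: each distinct predicted probability is a candidate cut-off.
    for threshold in sorted(set(y_prob)):
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
        tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:  # ties keep the lowest threshold
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


def brier_score(y_true: List[int], y_prob: List[float]) -> float:
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical toy data: 1 = poor recanalization, 0 = otherwise.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.60, 0.90]

threshold, j = youden_threshold(y_true, y_prob)
brier = brier_score(y_true, y_prob)
print(f"threshold={threshold:.2f}, J={j:.3f}, Brier={brier:.4f}")
```

In the study, the threshold was determined in the development set and then applied to the independent test set. The toy example collapses that into a single dataset purely to keep the sketch short.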
Suggested Citation
Yingjie Kuang & Jun Zhang & Zhen An & Chunxu Yang & Wenxu Guo & Xiaomin Liu & Yue Zhang, 2026.
"Machine learning-based model for predicting recanalization in isolated distal deep vein thrombosis and analysis of predictors,"
PLOS ONE, Public Library of Science, vol. 21(5), pages 1-14, May.
Handle:
RePEc:plo:pone00:0349110
DOI: 10.1371/journal.pone.0349110