Machine learning-based model for predicting recanalization in isolated distal deep vein thrombosis and analysis of predictors

Author

Listed:

Yingjie Kuang
Jun Zhang
Zhen An
Chunxu Yang
Wenxu Guo
Xiaomin Liu
Yue Zhang

Abstract

Background: Isolated distal deep vein thrombosis (IDDVT) is common, yet tools for predicting poor recanalization remain limited. We aimed to develop and compare machine learning models for predicting poor recanalization in patients with IDDVT and to identify the most informative predictors.

Methods: A total of 1600 patients with IDDVT were retrospectively enrolled. The dataset was randomly divided into a development set (n = 1280) and an independent test set (n = 320) using stratified sampling. Six predictive models were developed and compared: logistic regression (LR), support vector machine (SVM), random forest (RF), multilayer perceptron (MLP), extreme gradient boosting (XGBoost), and a Voting Ensemble. Model training and hyperparameter tuning were performed in the development set using five-fold stratified cross-validation, and optimal classification thresholds were determined using the Youden index. Model performance was evaluated by discrimination, calibration, and classification metrics, with 95% confidence intervals estimated by bootstrap resampling (10,000 iterations). SHAP analysis was applied to interpret the final model.

Results: In the independent test set, all models showed acceptable to strong discrimination, with AUC values ranging from 0.808 to 0.908. XGBoost achieved the best overall performance, with an optimal threshold of 0.183, an AUC of 0.908 (95% CI, 0.855–0.952), a Brier score of 0.077 (95% CI, 0.058–0.096), an accuracy of 0.900 (95% CI, 0.866–0.931), a precision of 0.650 (95% CI, 0.529–0.767), a recall of 0.803 (95% CI, 0.686–0.906), an F1-score of 0.717 (95% CI, 0.615–0.806), and a specificity of 0.918 (95% CI, 0.884–0.950). The calibration intercept and slope of the XGBoost model were 0.149 (95% CI, −0.192 to 0.454) and 1.410 (95% CI, 1.098–1.809), respectively, indicating acceptable overall calibration. SHAP analysis identified D-dimer rate, provoking-factor-related variables, anticoagulant use, and age group as the most influential predictors.

Conclusion: Among six candidate models, XGBoost showed the best overall performance for predicting poor recanalization in patients with IDDVT. This study establishes an interpretable machine learning-based prediction framework focused specifically on poor recanalization in IDDVT and highlights the contribution of dynamic laboratory information, particularly D-dimer rate. The model may support early risk stratification and individualized follow-up planning, but external validation is required before routine clinical implementation.
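The threshold selection and interval estimation named in the Methods (Youden index, Brier score, bootstrap resampling) can be illustrated with a minimal sketch. This is not the authors' code: it uses only the Python standard library, the function names are hypothetical, the eight labels and scores are invented toy values, and a simple percentile bootstrap is assumed because the abstract does not say which bootstrap variant was used.

```python
import random


def youden_threshold(y_true, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A case is classified positive when its score is >= threshold.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < t)
        j = tp / pos + tn / neg - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def brier_score(y_true, scores):
    """Mean squared difference between predicted probability and outcome."""
    return sum((s - y) ** 2 for y, s in zip(y_true, scores)) / len(y_true)


def bootstrap_ci(y_true, scores, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for metric(y_true, scores)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        # Resample patients with replacement, keeping label/score pairs together.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data (invented for illustration): 1 = poor recanalization.
y_toy = [0, 0, 0, 0, 1, 0, 1, 1]
p_toy = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.60, 0.80]

threshold, j_stat = youden_threshold(y_toy, p_toy)
brier = brier_score(y_toy, p_toy)
ci_low, ci_high = bootstrap_ci(y_toy, p_toy, brier_score)
print(threshold, round(j_stat, 3), round(brier, 4), (ci_low, ci_high))
```

The abstract reports 10,000 bootstrap iterations; the sketch defaults to 1,000 only to keep the toy run fast. The Brier score is used for the bootstrap example because it stays defined even when a resample happens to contain a single class, which is not true of the AUC.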

Suggested Citation

Yingjie Kuang & Jun Zhang & Zhen An & Chunxu Yang & Wenxu Guo & Xiaomin Liu & Yue Zhang, 2026. "Machine learning-based model for predicting recanalization in isolated distal deep vein thrombosis and analysis of predictors," PLOS ONE, Public Library of Science, vol. 21(5), pages 1-14, May.

Handle: RePEc:plo:pone00:0349110
DOI: 10.1371/journal.pone.0349110
