IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0312914.html

Enhancing stroke disease classification through machine learning models via a novel voting system by feature selection techniques

Author

Listed:
  • Mahade Hasan
  • Farhana Yasmin
  • Md Mehedi Hassan
  • Xue Yu
  • Soniya Yeasmin
  • Herat Joshi
  • Sheikh Mohammed Shariful Islam

Abstract

Heart disease remains a leading cause of mortality and morbidity worldwide, necessitating the development of accurate and reliable predictive models to facilitate early detection and intervention. While state-of-the-art work has explored various machine learning approaches for predicting heart disease, these approaches have not achieved remarkable accuracy. In response to this need, we applied nine machine learning algorithms to predict heart disease from a range of physiological indicators: XGBoost, logistic regression, decision tree, random forest, k-nearest neighbors (KNN), support vector machine (SVM), Gaussian naïve Bayes (NB Gaussian), adaptive boosting, and linear regression. Our approach used feature selection techniques to identify the most relevant predictors, with the aim of refining the models to enhance both performance and interpretability. The models were trained with grid search hyperparameter tuning and cross-validation to minimize overfitting. Additionally, we developed a novel voting system with feature selection techniques to advance heart disease classification. We evaluated the models using key performance metrics: accuracy, precision, recall, F1-score, and the area under the receiver operating characteristic curve (ROC AUC). Among the models, XGBoost demonstrated exceptional performance, achieving 99% accuracy, precision, and F1-score, 98% recall, and 100% ROC AUC. This study offers a promising approach to early heart disease diagnosis and preventive healthcare.
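The abstract does not describe how the voting system combines its models, so the following is only a minimal sketch of generic hard (majority) voting together with the accuracy, precision, recall, and F1-score metrics it names. It is not the authors' method. The function names `majority_vote` and `binary_metrics`, the three model prediction lists, and the toy labels are all hypothetical and chosen purely for illustration.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by hard (majority) voting."""
    combined = []
    # zip(*...) groups the votes that all models cast for the same sample.
    for votes in zip(*predictions_per_model):
        # Pick the most frequent label; with an odd number of binary
        # classifiers there are no ties.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy data: 1 = disease present, 0 = disease absent.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [1, 0, 1, 0, 0, 1, 1, 0]  # made-up predictions of a first model
model_b = [1, 0, 0, 1, 0, 0, 1, 0]  # made-up predictions of a second model
model_c = [0, 0, 1, 1, 1, 0, 1, 0]  # made-up predictions of a third model

ensemble = majority_vote([model_a, model_b, model_c])

for name, preds in [("model_a", model_a), ("model_b", model_b),
                    ("model_c", model_c), ("ensemble", ensemble)]:
    print(name, binary_metrics(y_true, preds))
```

In this toy data each individual model makes at least one mistake, but the mistakes fall on different samples, so the majority vote recovers every true label. Real models often make correlated errors, so an ensemble does not in general guarantee such a gain.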

Suggested Citation

  • Mahade Hasan & Farhana Yasmin & Md Mehedi Hassan & Xue Yu & Soniya Yeasmin & Herat Joshi & Sheikh Mohammed Shariful Islam, 2025. "Enhancing stroke disease classification through machine learning models via a novel voting system by feature selection techniques," PLOS ONE, Public Library of Science, vol. 20(1), pages 1-34, January.
  • Handle: RePEc:plo:pone00:0312914
    DOI: 10.1371/journal.pone.0312914

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0312914
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0312914&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0312914?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Stephen F Weng & Jenna Reps & Joe Kai & Jonathan M Garibaldi & Nadeem Qureshi, 2017. "Can machine-learning improve cardiovascular risk prediction using routine clinical data?," PLOS ONE, Public Library of Science, vol. 12(4), pages 1-14, April.
    2. Nyitrai, Tamás & Virág, Miklós, 2019. "The effects of handling outliers on the performance of bankruptcy prediction models," Socio-Economic Planning Sciences, Elsevier, vol. 67(C), pages 34-42.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Salvatore Tedesco & Martina Andrulli & Markus Åkerlund Larsson & Daniel Kelly & Antti Alamäki & Suzanne Timmons & John Barton & Joan Condell & Brendan O’Flynn & Anna Nordström, 2021. "Comparison of Machine Learning Techniques for Mortality Prediction in a Prospective Cohort of Older Adults," IJERPH, MDPI, vol. 18(23), pages 1-18, December.
    2. Lidiya Guryanova & Olena Bolotova & Vitalii Gvozdytskyi & Sergienko Olena, 2020. "Long-term financial sustainability: An evaluation methodology with threats considerations," RIVISTA DI STUDI SULLA SOSTENIBILITA', FrancoAngeli Editore, vol. 0(1), pages 47-69.
    3. Beata Gavurova & Sylvia Jencova & Radovan Bacik & Marta Miskufova & Stanislav Letkovsky, 2022. "Artificial intelligence in predicting the bankruptcy of non-financial corporations," Oeconomia Copernicana, Institute of Economic Research, vol. 13(4), pages 1215-1251, December.
    4. N Salet & A Gökdemir & J Preijde & C H van Heck & F Eijkenaar, 2024. "Using machine learning to predict acute myocardial infarction and ischemic heart disease in primary care cardiovascular patients," PLOS ONE, Public Library of Science, vol. 19(7), pages 1-17, July.
    5. Elena Gregova & Katarina Valaskova & Peter Adamko & Milos Tumpach & Jaroslav Jaros, 2020. "Predicting Financial Distress of Slovak Enterprises: Comparison of Selected Traditional and Learning Algorithms Methods," Sustainability, MDPI, vol. 12(10), pages 1-17, May.
    6. Ying Wang & Zhicheng Du & Wayne R. Lawrence & Yun Huang & Yu Deng & Yuantao Hao, 2019. "Predicting Hepatitis B Virus Infection Based on Health Examination Data of Community Population," IJERPH, MDPI, vol. 16(23), pages 1-13, December.
    7. Li, Liping & Chen, Qisheng & Li, Jing & Chen, Jin & Jia, Ximeng, 2024. "The impact of board capital on open innovation with the moderating effect of executive equity incentives," Research in International Business and Finance, Elsevier, vol. 72(PA).
    8. Shelda Sajeev & Stephanie Champion & Alline Beleigoli & Derek Chew & Richard L. Reed & Dianna J. Magliano & Jonathan E. Shaw & Roger L. Milne & Sarah Appleton & Tiffany K. Gill & Anthony Maeder, 2021. "Predicting Australian Adults at High Risk of Cardiovascular Disease Mortality Using Standard Risk Factors and Machine Learning," IJERPH, MDPI, vol. 18(6), pages 1-14, March.
    9. José G Fuentes Cabrera & Hugo A Pérez Vicente & Sebastián Maldonado & Jonás Velasco, 2023. "Combination of unsupervised discretization methods for credit risk," PLOS ONE, Public Library of Science, vol. 18(11), pages 1-18, November.
    10. Woo Suk Hong & Adrian Daniel Haimovich & R Andrew Taylor, 2018. "Predicting hospital admission at emergency department triage using machine learning," PLOS ONE, Public Library of Science, vol. 13(7), pages 1-13, July.
    11. Nidadavolu Venkat Durga Sai Siva Vara Prasad Raju & Penmetsa Naveena Devi, 2024. "AI-Assisted Medical Imaging and Heart Disease Diagnosis: A Deep Learning Approach for Automated Analysis and Enhanced Prediction Using Ensemble Classifiers," Journal of Artificial Intelligence General science (JAIGS) ISSN:3006-4023, Open Knowledge, vol. 6(1), pages 210-229.
    12. Dong, Weijia & Dong, Xinyang & Lv, Xin, 2022. "How does ownership structure affect corporate environmental responsibility? Evidence from the manufacturing sector in China," Energy Economics, Elsevier, vol. 112(C).
    13. Hoa Thi Nguyen & Claudia M. Denkinger & Stephan Brenner & Lisa Koeppel & Lucia Brugnara & Robin Burk & Michael Knop & Till Bärnighausen & Andreas Deckert & Manuela De Allegri, 2023. "Cost and cost-effectiveness of four different SARS-CoV-2 active surveillance strategies: evidence from a randomised control trial in Germany," The European Journal of Health Economics, Springer;Deutsche Gesellschaft für Gesundheitsökonomie (DGGÖ), vol. 24(9), pages 1545-1559, December.
    14. Mário S. Céu & Raquel M. Gaspar, 2023. "Financial Distress in European Vineyards and Olive Groves," Working Papers REM 2023/0266, ISEG - Lisbon School of Economics and Management, REM, Universidade de Lisboa.
    15. Sharan Srinivas, 2020. "A Machine Learning-Based Approach for Predicting Patient Punctuality in Ambulatory Care Centers," IJERPH, MDPI, vol. 17(10), pages 1-15, May.
    16. Syed Waseem Abbas Sherazi & Jang-Whan Bae & Jong Yun Lee, 2021. "A soft voting ensemble classifier for early prediction and diagnosis of occurrences of major adverse cardiovascular events for STEMI and NSTEMI during 2-year follow-up in patients with acute coronary ," PLOS ONE, Public Library of Science, vol. 16(6), pages 1-20, June.
    17. Michal Pavlicko & Marek Durica & Jaroslav Mazanec, 2021. "Ensemble Model of the Financial Distress Prediction in Visegrad Group Countries," Mathematics, MDPI, vol. 9(16), pages 1-26, August.
    18. Alexander Engels & Katrin C Reber & Ivonne Lindlbauer & Kilian Rapp & Gisela Büchele & Jochen Klenk & Andreas Meid & Clemens Becker & Hans-Helmut König, 2020. "Osteoporotic hip fracture prediction from risk factors available in administrative claims data – A machine learning approach," PLOS ONE, Public Library of Science, vol. 15(5), pages 1-14, May.
    19. Pablo Gonzalez Ginestet & Ales Kotalik & David M. Vock & Julian Wolfson & Erin E. Gabriel, 2021. "Stacked inverse probability of censoring weighted bagging: A case study in the InfCareHIV Register," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(1), pages 51-65, January.
    20. Marek Vochozka & Jaromir Vrbka & Petr Suler, 2020. "Bankruptcy or Success? The Effective Prediction of a Company’s Financial Development Using LSTM," Sustainability, MDPI, vol. 12(18), pages 1-17, September.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:plo:pone00:0312914. See general information about how to correct material in RePEc.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: plosone. General contact details of provider: https://journals.plos.org/plosone/.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.