
Enhancing stroke disease classification through machine learning models via a novel voting system by feature selection techniques

Author

Listed:
  • Mahade Hasan
  • Farhana Yasmin
  • Md Mehedi Hassan
  • Xue Yu
  • Soniya Yeasmin
  • Herat Joshi
  • Sheikh Mohammed Shariful Islam

Abstract

Heart disease remains a leading cause of mortality and morbidity worldwide, necessitating the development of accurate and reliable predictive models to facilitate early detection and intervention. Although state-of-the-art work has applied various machine learning approaches to heart disease prediction, it has not achieved remarkable accuracy. In response to this need, we applied nine machine learning algorithms (XGBoost, logistic regression, decision tree, random forest, k-nearest neighbors (KNN), support vector machine (SVM), Gaussian naïve Bayes (NB Gaussian), adaptive boosting, and linear regression) to predict heart disease based on a range of physiological indicators. Our approach used feature selection techniques to identify the most relevant predictors, with the aim of refining the models to enhance both performance and interpretability. The models were trained with grid search hyperparameter tuning and cross-validation to minimize overfitting. Additionally, we developed a novel voting system with feature selection techniques to advance heart disease classification. We evaluated the models using key performance metrics: accuracy, precision, recall, F1-score, and the area under the receiver operating characteristic curve (ROC AUC). Among the models, XGBoost demonstrated exceptional performance, achieving 99% accuracy, precision, and F1-score, 98% recall, and 100% ROC AUC. This study offers a promising approach to early heart disease diagnosis and preventive healthcare.
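
The abstract mentions a voting system and a set of evaluation metrics without giving details. As a rough illustration only, the following minimal Python sketch shows a plain hard majority vote over three classifiers and the computation of accuracy, precision, recall, and F1-score from confusion counts. It is not the authors' voting system, which the abstract does not specify. The function names (majority_vote, binary_metrics) and the toy labels and predictions are hypothetical and chosen only for this example. ROC AUC is omitted because it needs predicted scores rather than hard labels.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine several models' label lists by hard majority vote.

    `predictions` is a list of equal-length label lists, one per model.
    (Hypothetical helper for illustration; not the paper's voting scheme.)
    """
    combined = []
    for labels in zip(*predictions):
        # Pick the label predicted by the largest number of models.
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data (made up for this sketch): true labels and three models' predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [1, 0, 1, 0, 0, 1, 1, 0]
model_b = [1, 0, 0, 1, 0, 0, 1, 0]
model_c = [0, 0, 1, 1, 1, 0, 1, 0]

ensemble = majority_vote([model_a, model_b, model_c])

for name, pred in [("A", model_a), ("B", model_b), ("C", model_c),
                   ("vote", ensemble)]:
    print(name, binary_metrics(y_true, pred))
```

In this toy case each model makes its mistakes on different samples, so the vote recovers every true label. Real models tend to make correlated errors, so an ensemble should not be expected to improve this cleanly in practice.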

Suggested Citation

  • Mahade Hasan & Farhana Yasmin & Md Mehedi Hassan & Xue Yu & Soniya Yeasmin & Herat Joshi & Sheikh Mohammed Shariful Islam, 2025. "Enhancing stroke disease classification through machine learning models via a novel voting system by feature selection techniques," PLOS ONE, Public Library of Science, vol. 20(1), pages 1-34, January.
  • Handle: RePEc:plo:pone00:0312914
    DOI: 10.1371/journal.pone.0312914

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0312914
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0312914&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0312914?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mirza Rizwan Sajid & Bader A. Almehmadi & Waqas Sami & Mansour K. Alzahrani & Noryanti Muhammad & Christophe Chesneau & Asif Hanif & Arshad Ali Khan & Ahmad Shahbaz, 2021. "Development of Nonlaboratory-Based Risk Prediction Models for Cardiovascular Diseases Using Conventional and Machine Learning Approaches," IJERPH, MDPI, vol. 18(23), pages 1-16, November.
    2. Salvatore Tedesco & Martina Andrulli & Markus Åkerlund Larsson & Daniel Kelly & Antti Alamäki & Suzanne Timmons & John Barton & Joan Condell & Brendan O’Flynn & Anna Nordström, 2021. "Comparison of Machine Learning Techniques for Mortality Prediction in a Prospective Cohort of Older Adults," IJERPH, MDPI, vol. 18(23), pages 1-18, December.
    3. Ajay Dev & Sanjay Kumar Malik, 2021. "Artificial Bee Colony Optimized Deep Neural Network Model for Handling Imbalanced Stroke Data: ABC-DNN for Prediction of Stroke," International Journal of E-Health and Medical Communications (IJEHMC), IGI Global, vol. 12(5), pages 67-83, September.
    4. Lidiya Guryanova & Olena Bolotova & Vitalii Gvozdytskyi & Sergienko Olena, 2020. "Long-term financial sustainability: An evaluation methodology with threats considerations," RIVISTA DI STUDI SULLA SOSTENIBILITA', FrancoAngeli Editore, vol. 0(1), pages 47-69.
    5. Feihan Lu & Yao Zheng & Harrington Cleveland & Chris Burton & David Madigan, 2018. "Bayesian hierarchical vector autoregressive models for patient-level predictive modeling," PLOS ONE, Public Library of Science, vol. 13(12), pages 1-27, December.
    6. Beata Gavurova & Sylvia Jencova & Radovan Bacik & Marta Miskufova & Stanislav Letkovsky, 2022. "Artificial intelligence in predicting the bankruptcy of non-financial corporations," Oeconomia Copernicana, Institute of Economic Research, vol. 13(4), pages 1215-1251, December.
    7. Shinya Suzuki & Takeshi Yamashita & Tsuyoshi Sakama & Takuto Arita & Naoharu Yagi & Takayuki Otsuka & Hiroaki Semba & Hiroto Kano & Shunsuke Matsuno & Yuko Kato & Tokuhisa Uejima & Yuji Oikawa & Minor, 2019. "Comparison of risk models for mortality and cardiovascular events between machine learning and conventional logistic regression analysis," PLOS ONE, Public Library of Science, vol. 14(9), pages 1-14, September.
    8. N Salet & A Gökdemir & J Preijde & C H van Heck & F Eijkenaar, 2024. "Using machine learning to predict acute myocardial infarction and ischemic heart disease in primary care cardiovascular patients," PLOS ONE, Public Library of Science, vol. 19(7), pages 1-17, July.
    9. Elena Gregova & Katarina Valaskova & Peter Adamko & Milos Tumpach & Jaroslav Jaros, 2020. "Predicting Financial Distress of Slovak Enterprises: Comparison of Selected Traditional and Learning Algorithms Methods," Sustainability, MDPI, vol. 12(10), pages 1-17, May.
    10. Ying Wang & Zhicheng Du & Wayne R. Lawrence & Yun Huang & Yu Deng & Yuantao Hao, 2019. "Predicting Hepatitis B Virus Infection Based on Health Examination Data of Community Population," IJERPH, MDPI, vol. 16(23), pages 1-13, December.
    11. Li, Liping & Chen, Qisheng & Li, Jing & Chen, Jin & Jia, Ximeng, 2024. "The impact of board capital on open innovation with the moderating effect of executive equity incentives," Research in International Business and Finance, Elsevier, vol. 72(PA).
    12. Francesco Cappelli & Gianfranco Castronuovo & Salvatore Grimaldi & Vito Telesca, 2024. "Random Forest and Feature Importance Measures for Discriminating the Most Influential Environmental Factors in Predicting Cardiovascular and Respiratory Diseases," IJERPH, MDPI, vol. 21(7), pages 1-21, July.
    13. Vadlamani Ravi & Vadlamani Madhav, 2020. "Optimizing the reliability of a bank with Logistic Regression and Particle Swarm Optimization," Papers 2004.11122, arXiv.org.
    14. Fedorova, Elena & Ledyaeva, Svetlana & Drogovoz, Pavel & Nevredinov, Alexandr, 2022. "Economic policy uncertainty and bankruptcy filings," International Review of Financial Analysis, Elsevier, vol. 82(C).
    15. Shelda Sajeev & Stephanie Champion & Alline Beleigoli & Derek Chew & Richard L. Reed & Dianna J. Magliano & Jonathan E. Shaw & Roger L. Milne & Sarah Appleton & Tiffany K. Gill & Anthony Maeder, 2021. "Predicting Australian Adults at High Risk of Cardiovascular Disease Mortality Using Standard Risk Factors and Machine Learning," IJERPH, MDPI, vol. 18(6), pages 1-14, March.
    16. Emily J MacKay & Michael D Stubna & Corey Chivers & Michael E Draugelis & William J Hanson & Nimesh D Desai & Peter W Groeneveld, 2021. "Application of machine learning approaches to administrative claims data to predict clinical outcomes in medical and surgical patient populations," PLOS ONE, Public Library of Science, vol. 16(6), pages 1-14, June.
    17. José G Fuentes Cabrera & Hugo A Pérez Vicente & Sebastián Maldonado & Jonás Velasco, 2023. "Combination of unsupervised discretization methods for credit risk," PLOS ONE, Public Library of Science, vol. 18(11), pages 1-18, November.
    18. Kristóf, Tamás & Márton, András & Fiáth, Attila, 2023. "Állami energiavállalatok pénzügyi teljesítménye Magyarországon a koronavírus-járvány előtt és alatt [Financial performance of publicly owned energy companies in Hungary before and during the COVID ," Közgazdasági Szemle (Economic Review - monthly of the Hungarian Academy of Sciences), Közgazdasági Szemle Alapítvány (Economic Review Foundation), vol. 0(10), pages 1057-1076.
    19. Woo Suk Hong & Adrian Daniel Haimovich & R Andrew Taylor, 2018. "Predicting hospital admission at emergency department triage using machine learning," PLOS ONE, Public Library of Science, vol. 13(7), pages 1-13, July.
    20. Adrian Richter & Julia Truthmann & Jean-François Chenot & Carsten Oliver Schmidt, 2021. "Predicting Physician Consultations for Low Back Pain Using Claims Data and Population-Based Cohort Data—An Interpretable Machine Learning Approach," IJERPH, MDPI, vol. 18(22), pages 1-14, November.
