Printed from https://ideas.repec.org/a/plo/pdig00/0000334.html

A population-based study exploring phenotypic clusters and clinical outcomes in stroke using unsupervised machine learning approach

Author

Listed:
  • Ralph K Akyea
  • George Ntaios
  • Evangelos Kontopantelis
  • Georgios Georgiopoulos
  • Daniele Soria
  • Folkert W Asselbergs
  • Joe Kai
  • Stephen F Weng
  • Nadeem Qureshi

Abstract

Individuals who develop stroke vary in their clinical characteristics and in their demographic and biochemical profiles. This heterogeneity in phenotypic characteristics can affect cardiovascular disease (CVD) morbidity and mortality outcomes. This study uses a novel clustering approach to stratify individuals with incident stroke into phenotypic clusters and evaluates the differential burden of recurrent stroke and other cardiovascular outcomes. We used linked clinical data from primary care, hospitalisations, and death records in the UK. A data-driven clustering analysis (kamila algorithm) was applied to 48,114 patients aged ≥ 18 years who had an incident stroke between 1-Jan-1998 and 31-Dec-2017 and no prior history of serious vascular events. Cox proportional hazards regression was used to estimate hazard ratios (HRs) for subsequent adverse outcomes in each of the generated clusters. Adverse outcomes included coronary heart disease (CHD), recurrent stroke, peripheral vascular disease (PVD), heart failure, CVD-related mortality, and all-cause mortality.

Four distinct phenotypes with varying underlying clinical characteristics were identified in patients with incident stroke. Compared with cluster 1 (n = 5,201, 10.8%), the risk of composite recurrent stroke and CVD-related mortality was higher in the other 3 clusters (cluster 2 [n = 18,655, 38.8%]: HR, 1.07; 95% CI, 1.02–1.12; cluster 3 [n = 10,244, 21.3%]: HR, 1.20; 95% CI, 1.14–1.26; and cluster 4 [n = 14,014, 29.1%]: HR, 1.44; 95% CI, 1.37–1.50). Similar trends in risk were observed for the composite outcome of recurrent stroke and all-cause mortality, and for subsequent recurrent stroke alone. However, results were not consistent for the subsequent risk of CHD, PVD, heart failure, CVD-related mortality, and all-cause mortality.

In this proof-of-principle study, we demonstrated how a heterogeneous population of patients with incident stroke can be stratified into four relatively homogeneous phenotypes with differential risk of recurrent and major cardiovascular outcomes. This offers an opportunity to revisit the stratification of care for patients with incident stroke to improve patient outcomes.

Author summary: Using an unsupervised machine learning cluster analysis approach, adult patients with incident stroke were grouped into four clinically meaningful phenotypic clusters based on their demographic, biochemical, comorbidity, and prescribed-medication profiles at the time of incident stroke. The findings of this study highlight the significant heterogeneity that exists among patients with incident stroke with respect to subsequent cardiovascular morbidity and mortality outcomes. This offers an opportunity to revisit the stratification of care for patients with incident stroke to improve patient outcomes and highlights the potential to target modifiable characteristics in clusters for more targeted preventive intervention.
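
The cluster sizes and hazard ratios quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' analysis code: the names `CLUSTERS`, `cluster_share`, `log_hr_standard_error`, and `summarize` are hypothetical, and the standard-error step assumes the reported 95% CIs are symmetric Wald intervals on the log-hazard scale, which the abstract does not state.

```python
import math

# Figures copied from the abstract: cluster -> (n, HR vs cluster 1, CI lower, CI upper).
# Cluster 1 is the reference group, so it has no hazard ratio of its own.
COHORT_SIZE = 48_114
CLUSTERS = {
    1: (5_201, None, None, None),
    2: (18_655, 1.07, 1.02, 1.12),
    3: (10_244, 1.20, 1.14, 1.26),
    4: (14_014, 1.44, 1.37, 1.50),
}


def cluster_share(n, total=COHORT_SIZE):
    """Percentage of the cohort in a cluster, rounded to one decimal place."""
    return round(100 * n / total, 1)


def log_hr_standard_error(lower, upper, z=1.96):
    """Approximate SE of log(HR) recovered from a 95% CI.

    Assumption (not stated in the abstract): the interval is a symmetric
    Wald interval on the log scale, i.e. log(HR) +/- z * SE.
    """
    return (math.log(upper) - math.log(lower)) / (2 * z)


def summarize():
    """Return one row per cluster: (cluster, n, share %, SE of log HR, z statistic)."""
    rows = []
    for cluster, (n, hr, lower, upper) in CLUSTERS.items():
        share = cluster_share(n)
        if hr is None:
            # Reference cluster: nothing to compare against.
            rows.append((cluster, n, share, None, None))
            continue
        se = log_hr_standard_error(lower, upper)
        z_stat = math.log(hr) / se
        rows.append((cluster, n, share, se, z_stat))
    return rows


if __name__ == "__main__":
    for row in summarize():
        print(row)
```

Under these assumptions the four cluster sizes add up to the full cohort of 48,114, the rounded shares match the percentages in the abstract, and every lower confidence bound lies above 1, which is consistent with the reported higher risk in clusters 2 to 4 relative to cluster 1.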

Suggested Citation

  • Ralph K Akyea & George Ntaios & Evangelos Kontopantelis & Georgios Georgiopoulos & Daniele Soria & Folkert W Asselbergs & Joe Kai & Stephen F Weng & Nadeem Qureshi, 2023. "A population-based study exploring phenotypic clusters and clinical outcomes in stroke using unsupervised machine learning approach," PLOS Digital Health, Public Library of Science, vol. 2(9), pages 1-16, September.
  • Handle: RePEc:plo:pdig00:0000334
    DOI: 10.1371/journal.pdig.0000334

    Download full text from publisher

    File URL: https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000334
    Download Restriction: no

    File URL: https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000334&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pdig.0000334?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lee, Chioun & Ryff, Carol D., 2016. "Early parenthood as a link between childhood disadvantage and adult heart problems: A gender-based approach," Social Science & Medicine, Elsevier, vol. 171(C), pages 58-66.
    2. Denney, Justin T. & Brewer, Mackenzie & Kimbro, Rachel Tolbert, 2020. "Food insecurity in households with young children: A test of contextual congruence," Social Science & Medicine, Elsevier, vol. 263(C).
    3. Gerko Vink & Laurence E. Frank & Jeroen Pannekoek & Stef Buuren, 2014. "Predictive mean matching imputation of semicontinuous variables," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 68(1), pages 61-90, February.
    4. Watkins, Adam M. & Melde, Chris, 2018. "Gangs, gender, and involvement in crime, victimization, and exposure to violence," Journal of Criminal Justice, Elsevier, vol. 57(C), pages 11-25.
    5. HwaJung Choi & Robert F. Schoeni & Kenneth M. Langa & Michele M. Heisler, 2015. "Spouse and Child Availability for Newly Disabled Older Adults: Socioeconomic Differences and Potential Role of Residential Proximity," The Journals of Gerontology: Series B, The Gerontological Society of America, vol. 70(3), pages 462-469.
    6. Simon Grund & Oliver Lüdtke & Alexander Robitzsch, 2018. "Multiple Imputation of Missing Data at Level 2: A Comparison of Fully Conditional and Joint Modeling in Multilevel Designs," Journal of Educational and Behavioral Statistics, , vol. 43(3), pages 316-353, June.
    7. Jason R. D. Rarick & Carly Tubbs Dolan & Wen‐Jui Han & Jun Wen, 2018. "Relations Between Socioeconomic Status, Subjective Social Status, and Health in Shanghai, China," Social Science Quarterly, Southwestern Social Science Association, vol. 99(1), pages 390-405, March.
    8. David W Lawson & Arijeta Makoli & Anna Goodman, 2013. "Sibling Configuration Predicts Individual and Descendant Socioeconomic Success in a Modern Post-Industrial Society," PLOS ONE, Public Library of Science, vol. 8(9), pages 1-9, September.
    9. HwaJung Choi & Robert F. Schoeni & Kenneth M. Langa & Michele M. Heisler, 2015. "Older Adults’ Residential Proximity to Their Children: Changes After Cardiovascular Events," The Journals of Gerontology: Series B, The Gerontological Society of America, vol. 70(6), pages 995-1004.
    10. Brewer, Mackenzie & Kimbro, Rachel Tolbert, 2014. "Neighborhood context and immigrant children's physical activity," Social Science & Medicine, Elsevier, vol. 116(C), pages 1-9.
    11. Lee, RaeHyuck & Brooks-Gunn, Jeanne & Han, Wen-Jui & Waldfogel, Jane & Zhai, Fuhua, 2014. "Is participation in Head Start associated with less maternal spanking for boys and girls?," Children and Youth Services Review, Elsevier, vol. 46(C), pages 55-63.
    12. Lombardi, Caitlin McPherran, 2021. "Family income and mothers’ parenting quality: Within-family associations from infancy to late childhood," Children and Youth Services Review, Elsevier, vol. 120(C).
    13. Minda Tan & Shuiyun Liu, 2023. "A Way of Human Capital Accumulation: Heterogeneous Impact of Shadow Education on Students’ Academic Performance in China," SAGE Open, , vol. 13(4), pages 21582440231, November.
    14. Angela Cipollone & Carlo D'Ippoliti, 2009. "Women's Employment: Beyond Individual Characteristics vs. Contextual Factors Explanations," Working Papers CELEG 0901, Dipartimento di Economia e Finanza, LUISS Guido Carli.
    15. King, Christian, 2018. "Food insecurity and child behavior problems in fragile families," Economics & Human Biology, Elsevier, vol. 28(C), pages 14-22.
    16. Emma Zang & Anthony R. Bardo, 2019. "Objective and Subjective Socioeconomic Status, Their Discrepancy, and Health: Evidence from East Asia," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 143(2), pages 765-794, June.
    17. Álvaro Choi & María Gil & Mauro Mediavilla & Javier Valbuena, 2016. "Double toil and trouble: grade retention and academic performance," Working Papers 2016/7, Institut d'Economia de Barcelona (IEB).
    18. Oya Kalaycioglu & Andrew Copas & Michael King & Rumana Z. Omar, 2016. "A comparison of multiple-imputation methods for handling missing data in repeated measurements observational studies," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 179(3), pages 683-706, June.
    19. Shaun R. Seaman & Ian R. White & Andrew J. Copas & Leah Li, 2012. "Combining Multiple Imputation and Inverse-Probability Weighting," Biometrics, The International Biometric Society, vol. 68(1), pages 129-137, March.
    20. Asit Bhattacharyya & Lorne Cummings, 2015. "Measuring Corporate Environmental Performance – Stakeholder Engagement Evaluation," Business Strategy and the Environment, Wiley Blackwell, vol. 24(5), pages 309-325, July.

