
A population-based study exploring phenotypic clusters and clinical outcomes in stroke using unsupervised machine learning approach

Author

Listed:
  • Ralph K Akyea
  • George Ntaios
  • Evangelos Kontopantelis
  • Georgios Georgiopoulos
  • Daniele Soria
  • Folkert W Asselbergs
  • Joe Kai
  • Stephen F Weng
  • Nadeem Qureshi

Abstract

Individuals who develop stroke vary in their clinical, demographic, and biochemical profiles. This heterogeneity in phenotypic characteristics can affect cardiovascular disease (CVD) morbidity and mortality outcomes. This study uses a novel clustering approach to stratify individuals with incident stroke into phenotypic clusters and evaluates the differential burden of recurrent stroke and other cardiovascular outcomes. We used linked clinical data from primary care, hospitalisations, and death records in the UK. A data-driven clustering analysis (kamila algorithm) was applied to 48,114 patients aged ≥ 18 years with incident stroke between 1-Jan-1998 and 31-Dec-2017 and no prior history of serious vascular events. Cox proportional hazards regression was used to estimate hazard ratios (HRs) for subsequent adverse outcomes for each of the generated clusters. Adverse outcomes included coronary heart disease (CHD), recurrent stroke, peripheral vascular disease (PVD), heart failure, and CVD-related and all-cause mortality. Four distinct phenotypes with varying underlying clinical characteristics were identified in patients with incident stroke. Compared with cluster 1 (n = 5,201, 10.8%), the risk of composite recurrent stroke and CVD-related mortality was higher in the other 3 clusters (cluster 2 [n = 18,655, 38.8%]: hazard ratio [HR], 1.07; 95% CI, 1.02–1.12; cluster 3 [n = 10,244, 21.3%]: HR, 1.20; 95% CI, 1.14–1.26; and cluster 4 [n = 14,014, 29.1%]: HR, 1.44; 95% CI, 1.37–1.50). Similar trends in risk were observed for the composite of recurrent stroke and all-cause mortality, and for subsequent recurrent stroke. However, results were not consistent for subsequent risk of CHD, PVD, heart failure, CVD-related mortality, and all-cause mortality.
In this proof-of-principle study, we demonstrated how a heterogeneous population of patients with incident stroke can be stratified into four relatively homogeneous phenotypes with differential risk of recurrent and major cardiovascular outcomes. This offers an opportunity to revisit the stratification of care for patients with incident stroke to improve patient outcomes.

Author summary: Using an unsupervised machine learning cluster analysis approach, adult patients with incident stroke were grouped into four clinically meaningful phenotypic clusters based on their demographic, biochemical, comorbidity, and prescribed-medication profiles at the time of incident stroke. The findings of this study highlight the significant heterogeneity that exists within patients with incident stroke with respect to subsequent cardiovascular morbidity and mortality outcomes. This offers an opportunity to revisit the stratification of care for patients with incident stroke to improve patient outcomes and highlights the potential to target modifiable characteristics in clusters for more targeted preventive intervention.
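
The figures reported in the abstract can be sanity-checked with a little arithmetic. The sketch below is not the authors' analysis code. It only recomputes each cluster's share of the cohort from the reported counts and backs out an approximate standard error for each log hazard ratio, assuming the 95% confidence intervals are Wald intervals that are symmetric on the log scale (the usual convention for Cox models, but an assumption here). The function and variable names are illustrative.

```python
import math

# Cluster sizes and hazard ratios as reported in the abstract.
# Cluster 1 is the reference group, so it has no hazard ratio.
CLUSTER_SIZES = {1: 5201, 2: 18655, 3: 10244, 4: 14014}
REPORTED_HR = {  # cluster -> (HR, lower 95% limit, upper 95% limit)
    2: (1.07, 1.02, 1.12),
    3: (1.20, 1.14, 1.26),
    4: (1.44, 1.37, 1.50),
}
Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def cluster_shares(sizes):
    """Return each cluster's share of the cohort, in percent."""
    total = sum(sizes.values())
    return {k: 100.0 * n / total for k, n in sizes.items()}


def log_hr_standard_error(lower, upper, z=Z_95):
    """Approximate SE of log(HR), assuming a log-symmetric Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


if __name__ == "__main__":
    print("cohort size:", sum(CLUSTER_SIZES.values()))
    for k, share in cluster_shares(CLUSTER_SIZES).items():
        print(f"cluster {k}: {share:.1f}% of cohort")
    for k, (hr, lower, upper) in REPORTED_HR.items():
        se = log_hr_standard_error(lower, upper)
        print(f"cluster {k}: log(HR) = {math.log(hr):.3f}, SE approx. {se:.3f}")
```

The cluster counts sum to the stated cohort of 48,114, and the recomputed shares round to the percentages given in the abstract. The recovered standard errors are only as precise as the two-decimal rounding of the published interval limits allows.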

Suggested Citation

  • Ralph K Akyea & George Ntaios & Evangelos Kontopantelis & Georgios Georgiopoulos & Daniele Soria & Folkert W Asselbergs & Joe Kai & Stephen F Weng & Nadeem Qureshi, 2023. "A population-based study exploring phenotypic clusters and clinical outcomes in stroke using unsupervised machine learning approach," PLOS Digital Health, Public Library of Science, vol. 2(9), pages 1-16, September.
  • Handle: RePEc:plo:pdig00:0000334
    DOI: 10.1371/journal.pdig.0000334

    Download full text from publisher

    File URL: https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000334
    Download Restriction: no

    File URL: https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000334&type=printable
    Download Restriction: no

