Printed from https://ideas.repec.org/a/plo/pone00/0264828.html

Reproducible disease phenotyping at scale: Example of coronary artery disease in UK Biobank

Author

Listed:
  • Riyaz S Patel
  • Spiros Denaxas
  • Laurence J Howe
  • Rosalind M Eggo
  • Anoop D Shah
  • Naomi E Allen
  • John Danesh
  • Aroon Hingorani
  • Cathie Sudlow
  • Harry Hemingway

Abstract

Importance: A lack of internationally agreed standards for combining available data sources at scale risks inconsistent disease phenotyping, which limits research reproducibility.

Objective: To develop a rules-based algorithm and evaluate whether it can identify coronary artery disease (CAD) sub-phenotypes using electronic health records (EHR) and questionnaire data from UK Biobank (UKB).

Design: Case-control and cohort study.

Setting: Prospective cohort study of approximately 502,000 individuals aged 40–69 years, recruited into the UK Biobank between 2006 and 2010, with linked hospitalization and mortality data and genotyping.

Participants: All individuals were included for phenotyping into six predefined CAD phenotypes using hospital admission and procedure codes, mortality records and baseline survey data. Of these, 408,470 unrelated individuals of European descent had a polygenic risk score (PRS) for CAD estimated.

Exposure: CAD phenotypes.

Main outcomes and measures: Association with baseline risk factors, mortality (n = 14,419 over a median follow-up of 7.8 years), and a PRS for CAD.

Results: The algorithm classified individuals with CAD into prevalent MI (n = 4,900), incident MI (n = 4,621), prevalent CAD without MI (n = 10,910), incident CAD without MI (n = 8,668), prevalent self-reported MI (n = 2,754), and prevalent self-reported CAD without MI (n = 5,623), yielding 37,476 individuals with any type of CAD. Risk factors were similar across the six CAD phenotypes, except for fewer men in the self-reported CAD without MI group (46.7% v 70.1% for the overall group). In age- and sex-adjusted survival analyses, compared with disease-free controls, mortality was highest following incident MI (HR 6.66, 95% CI 6.07–7.31) and lowest for prevalent self-reported CAD without MI at baseline (HR 1.31, 95% CI 1.15–1.50). There were similar graded associations across the six phenotypes per SD increase in PRS, with the strongest association for prevalent MI (OR 1.50, 95% CI 1.46–1.55) and the weakest for prevalent self-reported CAD without MI (OR 1.08, 95% CI 1.05–1.12). The algorithm is available in the open HDR UK phenotype library (https://portal.caliberresearch.org/).

Conclusions: An algorithmic, EHR-based approach distinguished six phenotypes of CAD with distinct survival and PRS associations, supporting the adoption of open approaches to help standardize CAD phenotyping, and suggesting wider potential value for reproducible research in other conditions.
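As a rough illustration of what a rules-based assignment to the six phenotypes could look like, the Python sketch below classifies a participant from a few summary fields. It is not the published algorithm: the `Participant` fields, the `classify` function, and the precedence of the rules (EHR evidence before self-report, MI before CAD without MI) are all assumptions made for illustration, and the actual code lists and rules are those in the HDR UK phenotype library. The last lines only check that the six counts reported in the abstract add up to the stated total.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Participant:
    """Hypothetical per-person summary; field names are illustrative only."""
    baseline: date                        # date of the baseline assessment
    first_ehr_mi: Optional[date] = None   # earliest MI in hospital or mortality records
    first_ehr_cad: Optional[date] = None  # earliest non-MI CAD code or procedure
    self_reported_mi: bool = False        # MI reported in the baseline survey
    self_reported_cad: bool = False       # non-MI CAD reported in the baseline survey


def classify(p: Participant) -> Optional[str]:
    """Return one of the six phenotype labels, or None if no CAD evidence.

    Assumed precedence (not taken from the paper): EHR-recorded MI,
    then EHR-recorded CAD without MI, then self-report. "Prevalent"
    means on or before baseline; "incident" means after baseline.
    """
    if p.first_ehr_mi is not None:
        return "prevalent MI" if p.first_ehr_mi <= p.baseline else "incident MI"
    if p.first_ehr_cad is not None:
        if p.first_ehr_cad <= p.baseline:
            return "prevalent CAD without MI"
        return "incident CAD without MI"
    if p.self_reported_mi:
        return "prevalent self-reported MI"
    if p.self_reported_cad:
        return "prevalent self-reported CAD without MI"
    return None


# Counts per phenotype as reported in the abstract.
REPORTED_COUNTS = {
    "prevalent MI": 4900,
    "incident MI": 4621,
    "prevalent CAD without MI": 10910,
    "incident CAD without MI": 8668,
    "prevalent self-reported MI": 2754,
    "prevalent self-reported CAD without MI": 5623,
}

total_any_cad = sum(REPORTED_COUNTS.values())
print(total_any_cad)  # 37476, matching the "any type of CAD" total
```

Because each participant receives exactly one label in this sketch, the six groups are mutually exclusive, which is consistent with the reported counts summing to the overall total.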

Suggested Citation

  • Riyaz S Patel & Spiros Denaxas & Laurence J Howe & Rosalind M Eggo & Anoop D Shah & Naomi E Allen & John Danesh & Aroon Hingorani & Cathie Sudlow & Harry Hemingway, 2022. "Reproducible disease phenotyping at scale: Example of coronary artery disease in UK Biobank," PLOS ONE, Public Library of Science, vol. 17(4), pages 1-15, April.
  • Handle: RePEc:plo:pone00:0264828
    DOI: 10.1371/journal.pone.0264828

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0264828
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0264828&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Matteo Di Scipio & Mohammad Khan & Shihong Mao & Michael Chong & Conor Judge & Nazia Pathan & Nicolas Perrot & Walter Nelson & Ricky Lali & Shuang Di & Robert Morton & Jeremy Petch & Guillaume Paré, 2023. "A versatile, fast and unbiased method for estimation of gene-by-environment interaction effects on biobank-scale datasets," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    2. Jacob Joseph & Chang Liu & Qin Hui & Krishna Aragam & Zeyuan Wang & Brian Charest & Jennifer E. Huffman & Jacob M. Keaton & Todd L. Edwards & Serkalem Demissie & Luc Djousse & Juan P. Casas & J. Micha, 2022. "Genetic architecture of heart failure with preserved versus reduced ejection fraction," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    3. Vincent Michaud & Eulalie Lasseaux & David J. Green & Dave T. Gerrard & Claudio Plaisant & Tomas Fitzgerald & Ewan Birney & Benoît Arveiler & Graeme C. Black & Panagiotis I. Sergouniotis, 2022. "The contribution of common regulatory and protein-coding TYR variants to the genetic architecture of albinism," Nature Communications, Nature, vol. 13(1), pages 1-8, December.
    4. Lili Liu & Atlas Khan & Elena Sanchez-Rodriguez & Francesca Zanoni & Yifu Li & Nicholas Steers & Olivia Balderes & Junying Zhang & Priya Krithivasan & Robert A. LeDesma & Clara Fischman & Scott J. Heb, 2022. "Genetic regulation of serum IgA levels and susceptibility to common immune, infectious, kidney, and cardio-metabolic traits," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    5. Sylvia Hartmann & Summaira Yasmeen & Benjamin M. Jacobs & Spiros Denaxas & Munir Pirmohamed & Eric R. Gamazon & Mark J. Caulfield & Harry Hemingway & Maik Pietzner & Claudia Langenberg, 2023. "ADRA2A and IRX1 are putative risk genes for Raynaud’s phenomenon," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    6. Mit Shah & Marco H. A. Inácio & Chang Lu & Pierre-Raphaël Schiratti & Sean L. Zheng & Adam Clement & Antonio Marvao & Wenjia Bai & Andrew P. King & James S. Ware & Martin R. Wilkins & Johanna Mielke &, 2023. "Environmental and genetic predictors of human cardiovascular ageing," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    7. Mathias Seviiri & Matthew H. Law & Jue-Sheng Ong & Puya Gharahkhani & Pierre Fontanillas & Catherine M. Olsen & David C. Whiteman & Stuart MacGregor, 2022. "A multi-phenotype analysis reveals 19 susceptibility loci for basal cell carcinoma and 15 for squamous cell carcinoma," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    8. Junqing Xie & Shuo Feng & Xintong Li & Ester Gea-Mallorquí & Albert Prats-Uribe & Dani Prieto-Alhambra, 2022. "Comparative effectiveness of the BNT162b2 and ChAdOx1 vaccines against Covid-19 in people over 50," Nature Communications, Nature, vol. 13(1), pages 1-8, December.
    9. Erik Schoenmakers & Federica Marelli & Helle F. Jørgensen & W. Edward Visser & Carla Moran & Stefan Groeneweg & Carolina Avalos & Sean J. Jurgens & Nichola Figg & Alison Finigan & Neha Wali & Maura Ag, 2023. "Selenoprotein deficiency disorder predisposes to aortic aneurysm formation," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    10. Harry D Green & Alistair Jones & Jonathan P Evans & Andrew R Wood & Robin N Beaumont & Jessica Tyrrell & Timothy M Frayling & Christopher Smith & Michael N Weedon, 2021. "A genome-wide association study identifies 5 loci associated with frozen shoulder and implicates diabetes as a causal risk factor," PLOS Genetics, Public Library of Science, vol. 17(6), pages 1-13, June.
    11. Zhen Qiao & Julia Sidorenko & Joana A. Revez & Angli Xue & Xueling Lu & Katri Pärna & Harold Snieder & Peter M. Visscher & Naomi R. Wray & Loic Yengo, 2023. "Estimation and implications of the genetic architecture of fasting and non-fasting blood glucose," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    12. Xiaoyi Raymond Gao & Marion Chiariglione & Alexander J. Arch, 2022. "Whole-exome sequencing study identifies rare variants and genes associated with intraocular pressure and glaucoma," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    13. Romain Fournier & Zoi Tsangalidou & David Reich & Pier Francesco Palamara, 2023. "Haplotype-based inference of recent effective population size in modern and ancient DNA samples," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    14. Nicole Deflaux & Margaret Sunitha Selvaraj & Henry Robert Condon & Kelsey Mayo & Sara Haidermota & Melissa A. Basford & Chris Lunt & Anthony A. Philippakis & Dan M. Roden & Joshua C. Denny & Anjene Mu, 2023. "Demonstrating paths for unlocking the value of cloud genomics through cross cohort analysis," Nature Communications, Nature, vol. 14(1), pages 1-10, December.
    15. George B. Busby & Scott Kulm & Alessandro Bolli & Jen Kintzle & Paolo Di Domenico & Giordano Bottà, 2023. "Ancestry-specific polygenic risk scores are risk enhancers for clinical cardiovascular disease assessments," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    16. Dixon, Padraig & Hollingworth, William & Harrison, Sean & Davies, Neil M. & Davey Smith, George, 2020. "Mendelian Randomization analysis of the causal effect of adiposity on hospital costs," Journal of Health Economics, Elsevier, vol. 70(C).
    17. van den Berg, Gerard J. & von Hinke, Stephanie & Wang, R. Adele H., 2022. "Prenatal Sugar Consumption and Late-Life Human Capital and Health: Analyses Based on Postwar Rationing and Polygenic Scores," IZA Discussion Papers 15544, Institute of Labor Economics (IZA).
    18. Xingjie Hao & Zhonghe Shao & Ning Zhang & Minghui Jiang & Xi Cao & Si Li & Yunlong Guan & Chaolong Wang, 2023. "Integrative genome-wide analyses identify novel loci associated with kidney stones and provide insights into its genetic architecture," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    19. Ruoyu Tian & Tian Ge & Hyeokmoon Kweon & Daniel B. Rocha & Max Lam & Jimmy Z. Liu & Kritika Singh & Daniel F. Levey & Joel Gelernter & Murray B. Stein & Ellen A. Tsai & Hailiang Huang & Christopher F., 2024. "Whole-exome sequencing in UK Biobank reveals rare genetic architecture for depression," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    20. Nazia Pathan & Wei Q. Deng & Matteo Di Scipio & Mohammad Khan & Shihong Mao & Robert W. Morton & Ricky Lali & Marie Pigeyre & Michael R. Chong & Guillaume Paré, 2024. "A method to estimate the contribution of rare coding variants to complex trait heritability," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
