Printed from https://ideas.repec.org/a/nat/natcom/v12y2021i1d10.1038_s41467-021-25171-9.html

Incorporating functional priors improves polygenic prediction accuracy in UK Biobank and 23andMe data sets

Author

Listed:
  • Carla Márquez-Luna

    (Harvard T.H. Chan School of Public Health
    Broad Institute of Harvard and MIT
    Icahn School of Medicine at Mount Sinai)

  • Steven Gazal

    (Broad Institute of Harvard and MIT
    Icahn School of Medicine at Mount Sinai)

  • Po-Ru Loh

    (Broad Institute of Harvard and MIT
    Harvard T.H. Chan School of Public Health
    Brigham and Women’s Hospital and Harvard Medical School)

  • Samuel S. Kim

    (Broad Institute of Harvard and MIT
    Massachusetts Institute of Technology)

  • Nicholas Furlotte

    (23andMe Inc.)

  • Adam Auton

    (23andMe Inc.)

  • Alkes L. Price

    (Harvard T.H. Chan School of Public Health
    Broad Institute of Harvard and MIT)

Abstract

Polygenic risk prediction is a widely investigated topic because of its promising clinical applications. Genetic variants in functional regions of the genome are enriched for complex trait heritability. Here, we introduce a method for polygenic prediction, LDpred-funct, that leverages trait-specific functional priors to increase prediction accuracy. We fit priors using the recently developed baseline-LD model, including coding, conserved, regulatory, and LD-related annotations. We analytically estimate posterior mean causal effect sizes and then use cross-validation to regularize these estimates, improving prediction accuracy for sparse architectures. We applied LDpred-funct to predict 21 highly heritable traits in the UK Biobank (avg N = 373 K as training data). LDpred-funct attained a +4.6% relative improvement in average prediction accuracy (avg prediction R² = 0.144; highest R² = 0.413 for height) compared to SBayesR (the best method that does not incorporate functional information). For height, meta-analyzing training data from UK Biobank and 23andMe cohorts (N = 1107 K) increased prediction R² to 0.431. Our results show that incorporating functional priors improves polygenic prediction accuracy, consistent with the functional architecture of complex traits.
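
As a rough illustration of what a posterior mean effect size under a functional prior looks like, the sketch below covers only the simplest textbook case: unlinked SNPs (no LD), standardized genotypes and phenotype, and a Gaussian prior whose per-SNP variance would come from functional annotations. Under those assumptions each marginal estimate is shrunk by the factor prior_var / (prior_var + 1/N). This is not the LDpred-funct implementation; the function name `posterior_mean_no_ld` and all input values are hypothetical and chosen only for illustration.

```python
import numpy as np


def posterior_mean_no_ld(beta_hat, prior_var, n):
    """Posterior mean effect under a per-SNP Gaussian prior, ignoring LD.

    Assumption of this sketch: genotypes and phenotype are standardized,
    so each marginal estimate has sampling variance of roughly 1/n.
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    prior_var = np.asarray(prior_var, dtype=float)
    # Shrinkage factor: prior variance relative to prior plus sampling variance.
    shrink = prior_var / (prior_var + 1.0 / n)
    return shrink * beta_hat


# Hypothetical inputs: three SNPs with marginal effect estimates and
# annotation-derived prior variances (a larger variance means the SNP sits
# in a category assumed to be more enriched for heritability).
beta_hat = np.array([0.010, 0.010, -0.004])
prior_var = np.array([1e-5, 1e-7, 1e-6])
n = 100_000

posterior = posterior_mean_no_ld(beta_hat, prior_var, n)
print(posterior)
```

The first two SNPs have the same marginal estimate, but the one with the larger prior variance keeps more of its estimated effect, which is the intuition behind letting functional annotations set the prior.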

Suggested Citation

  • Carla Márquez-Luna & Steven Gazal & Po-Ru Loh & Samuel S. Kim & Nicholas Furlotte & Adam Auton & Alkes L. Price, 2021. "Incorporating functional priors improves polygenic prediction accuracy in UK Biobank and 23andMe data sets," Nature Communications, Nature, vol. 12(1), pages 1-11, December.
  • Handle: RePEc:nat:natcom:v:12:y:2021:i:1:d:10.1038_s41467-021-25171-9
    DOI: 10.1038/s41467-021-25171-9

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-021-25171-9
    File Function: Abstract
    Download Restriction: no


    References listed on IDEAS

    1. Tian Ge & Chia-Yen Chen & Yang Ni & Yen-Chen Anne Feng & Jordan W. Smoller, 2019. "Polygenic prediction via Bayesian regression and continuous shrinkage priors," Nature Communications, Nature, vol. 10(1), pages 1-10, December.
    2. Armin P. Schoech & Daniel M. Jordan & Po-Ru Loh & Steven Gazal & Luke J. O’Connor & Daniel J. Balick & Pier F. Palamara & Hilary K. Finucane & Shamil R. Sunyaev & Alkes L. Price, 2019. "Quantification of frequency-dependent genetic architectures in 25 UK Biobank traits reveals action of negative selection," Nature Communications, Nature, vol. 10(1), pages 1-10, December.
    3. Clare Bycroft & Colin Freeman & Desislava Petkova & Gavin Band & Lloyd T. Elliott & Kevin Sharp & Allan Motyer & Damjan Vukcevic & Olivier Delaneau & Jared O’Connell & Adrian Cortes & Samantha Welsh &, 2018. "The UK Biobank resource with deep phenotyping and genomic data," Nature, Nature, vol. 562(7726), pages 203-209, October.
    4. Ying Wang & Jing Guo & Guiyan Ni & Jian Yang & Peter M. Visscher & Loic Yengo, 2020. "Theoretical and empirical quantification of the accuracy of polygenic scores in ancestry divergent populations," Nature Communications, Nature, vol. 11(1), pages 1-9, December.
    5. Xiang Zhou & Peter Carbonetto & Matthew Stephens, 2013. "Polygenic Modeling with Bayesian Sparse Linear Mixed Models," PLOS Genetics, Public Library of Science, vol. 9(2), pages 1-14, February.
    6. Jianxin Shi & Ju-Hyun Park & Jubao Duan & Sonja T Berndt & Winton Moy & Kai Yu & Lei Song & William Wheeler & Xing Hua & Debra Silverman & Montserrat Garcia-Closas & Chao Agnes Hsiung & Jonine D Figue, 2016. "Winner's Curse Correction and Variable Thresholding Improve Performance of Polygenic Risk Modeling Based on Genome-Wide Association Study Summary-Level Data," PLOS Genetics, Public Library of Science, vol. 12(12), pages 1-24, December.
    7. Gerhard Moser & Sang Hong Lee & Ben J Hayes & Michael E Goddard & Naomi R Wray & Peter M Visscher, 2015. "Simultaneous Discovery, Estimation and Prediction Analysis of Complex Traits Using a Bayesian Mixture Model," PLOS Genetics, Public Library of Science, vol. 11(4), pages 1-22, April.
    8. Yiming Hu & Qiongshi Lu & Ryan Powles & Xinwei Yao & Can Yang & Fang Fang & Xinran Xu & Hongyu Zhao, 2017. "Leveraging functional annotations in genetic risk prediction for human complex diseases," PLOS Computational Biology, Public Library of Science, vol. 13(6), pages 1-16, June.
    9. Luke R. Lloyd-Jones & Jian Zeng & Julia Sidorenko & Loïc Yengo & Gerhard Moser & Kathryn E. Kemper & Huanwei Wang & Zhili Zheng & Reedik Magi & Tõnu Esko & Andres Metspalu & Naomi R. Wray & Michael E., 2019. "Improved polygenic prediction by Bayesian multiple regression on summary statistics," Nature Communications, Nature, vol. 10(1), pages 1-11, December.
    10. Robert M. Maier & Zhihong Zhu & Sang Hong Lee & Maciej Trzaskowski & Douglas M. Ruderfer & Eli A. Stahl & Stephan Ripke & Naomi R. Wray & Jian Yang & Peter M. Visscher & Matthew R. Robinson, 2018. "Improving genetic prediction by leveraging genetic correlations among human diseases and traits," Nature Communications, Nature, vol. 9(1), pages 1-17, December.
    11. L. Duncan & H. Shen & B. Gelaye & J. Meijsen & K. Ressler & M. Feldman & R. Peterson & B. Domingue, 2019. "Analysis of polygenic risk score usage and performance in diverse human populations," Nature Communications, Nature, vol. 10(1), pages 1-9, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Brieuc Lehmann & Maxine Mackintosh & Gil McVean & Chris Holmes, 2023. "Optimal strategies for learning multi-ancestry polygenic scores vary across traits," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    2. Pereira, Rita & Biroli, Pietro & von Hinke, Stephanie & Van Kippersluis, Hans & Galama, Titus & Rietveld, Niels & Thom, Kevin, 2022. "Gene-Environment Interplay in the Social Sciences," OSF Preprints d96z3, Center for Open Science.
    3. Jiacheng Miao & Hanmin Guo & Gefei Song & Zijie Zhao & Lin Hou & Qiongshi Lu, 2023. "Quantifying portable genetic effects and improving cross-ancestry genetic prediction with GWAS summary statistics," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    4. Clara Albiñana & Zhihong Zhu & Andrew J. Schork & Andrés Ingason & Hugues Aschard & Isabell Brikell & Cynthia M. Bulik & Liselotte V. Petersen & Esben Agerbo & Jakob Grove & Merete Nordentoft & David , 2023. "Multi-PGS enhances polygenic prediction by combining 937 polygenic scores," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    5. Zhen Qiao & Julia Sidorenko & Joana A. Revez & Angli Xue & Xueling Lu & Katri Pärna & Harold Snieder & Peter M. Visscher & Naomi R. Wray & Loic Yengo, 2023. "Estimation and implications of the genetic architecture of fasting and non-fasting blood glucose," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    6. Yosuke Tanigawa & Junyang Qian & Guhan Venkataraman & Johanne Marie Justesen & Ruilin Li & Robert Tibshirani & Trevor Hastie & Manuel A Rivas, 2022. "Significant sparse polygenic risk scores across 813 traits in UK Biobank," PLOS Genetics, Public Library of Science, vol. 18(3), pages 1-21, March.
    7. Junyang Qian & Yosuke Tanigawa & Wenfei Du & Matthew Aguirre & Chris Chang & Robert Tibshirani & Manuel A Rivas & Trevor Hastie, 2020. "A fast and scalable framework for large-scale and ultrahigh-dimensional sparse regression with application to the UK Biobank," PLOS Genetics, Public Library of Science, vol. 16(10), pages 1-30, October.
    8. Wei Jiang & Ling Chen & Matthew J. Girgenti & Hongyu Zhao, 2024. "Tuning parameters for polygenic risk score methods using GWAS summary statistics from training data," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    9. George B. Busby & Scott Kulm & Alessandro Bolli & Jen Kintzle & Paolo Di Domenico & Giordano Bottà, 2023. "Ancestry-specific polygenic risk scores are risk enhancers for clinical cardiovascular disease assessments," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    10. Ruoyu Tian & Tian Ge & Hyeokmoon Kweon & Daniel B. Rocha & Max Lam & Jimmy Z. Liu & Kritika Singh & Daniel F. Levey & Joel Gelernter & Murray B. Stein & Ellen A. Tsai & Hailiang Huang & Christopher F., 2024. "Whole-exome sequencing in UK Biobank reveals rare genetic architecture for depression," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    11. Magdalena Zimoń & Yunfeng Huang & Anthi Trasta & Aliaksandr Halavatyi & Jimmy Z. Liu & Chia-Yen Chen & Peter Blattmann & Bernd Klaus & Christopher D. Whelan & David Sexton & Sally John & Wolfgang Hube, 2021. "Pairwise effects between lipid GWAS genes modulate lipid plasma levels and cellular uptake," Nature Communications, Nature, vol. 12(1), pages 1-16, December.
    12. Jingning Zhang & Jianan Zhan & Jin Jin & Cheng Ma & Ruzhang Zhao & Jared O’Connell & Yunxuan Jiang & Bertram L. Koelsch & Haoyu Zhang & Nilanjan Chatterjee, 2024. "An ensemble penalized regression method for multi-ancestry polygenic risk prediction," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    13. Song Zhai & Hong Zhang & Devan V. Mehrotra & Judong Shen, 2022. "Pharmacogenomics polygenic risk score for drug response prediction using PRS-PGx methods," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    14. Wenhan Chen & Yang Wu & Zhili Zheng & Ting Qi & Peter M. Visscher & Zhihong Zhu & Jian Yang, 2021. "Improved analyses of GWAS summary statistics by reducing data heterogeneity and errors," Nature Communications, Nature, vol. 12(1), pages 1-10, December.
    15. Geyu Zhou & Hongyu Zhao, 2021. "A fast and robust Bayesian nonparametric method for prediction of complex traits using summary statistics," PLOS Genetics, Public Library of Science, vol. 17(7), pages 1-17, July.
    16. Parsa Akbari & Olukayode A. Sosina & Jonas Bovijn & Karl Landheer & Jonas B. Nielsen & Minhee Kim & Senem Aykul & Tanima De & Mary E. Haas & George Hindy & Nan Lin & Ian R. Dinsmore & Jonathan Z. Luo , 2022. "Multiancestry exome sequencing reveals INHBE mutations associated with favorable fat distribution and protection from diabetes," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    17. Injeong Shim & Hiroyuki Kuwahara & NingNing Chen & Mais O. Hashem & Lama AlAbdi & Mohamed Abouelhoda & Hong-Hee Won & Pradeep Natarajan & Patrick T. Ellinor & Amit V. Khera & Xin Gao & Fowzan S. Alkur, 2023. "Clinical utility of polygenic scores for cardiometabolic disease in Arabs," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    18. Yanyi Song & Xiang Zhou & Min Zhang & Wei Zhao & Yongmei Liu & Sharon L. R. Kardia & Ana V. Diez Roux & Belinda L. Needham & Jennifer A. Smith & Bhramar Mukherjee, 2020. "Bayesian shrinkage estimation of high dimensional causal mediation effects in omics studies," Biometrics, The International Biometric Society, vol. 76(3), pages 700-710, September.
    19. Hui Li & Rahul Mazumder & Xihong Lin, 2023. "Accurate and efficient estimation of local heritability using summary statistics and the linkage disequilibrium matrix," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    20. Young Jin Kim & Sanghoon Moon & Mi Yeong Hwang & Sohee Han & Hye-Mi Jang & Jinhwa Kong & Dong Mun Shin & Kyungheon Yoon & Sung Min Kim & Jong-Eun Lee & Anubha Mahajan & Hyun-Young Park & Mark I. McCar, 2022. "The contribution of common and rare genetic variants to variation in metabolic traits in 288,137 East Asians," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
