IDEAS home Printed from https://ideas.repec.org/a/nat/natcom/v13y2022i1d10.1038_s41467-022-34932-z.html

GhostKnockoff inference empowers identification of putative causal variants in genome-wide association studies

Author

Listed:
  • Zihuai He

    (Stanford University)

  • Linxi Liu

    (University of Pittsburgh)

  • Michael E. Belloy

    (Stanford University)

  • Yann Guen

    (Stanford University
    Institut du Cerveau - Paris Brain Institute - ICM)

  • Aaron Sossin

    (Stanford University)

  • Xiaoxia Liu

    (Stanford University)

  • Xinran Qi

    (Stanford University)

  • Shiyang Ma

    (Columbia University)

  • Prashnna K. Gyawali

    (Stanford University)

  • Tony Wyss-Coray

    (Stanford University)

  • Hua Tang

    (Stanford University)

  • Chiara Sabatti

    (Stanford University)

  • Emmanuel Candès

    (Stanford University)

  • Michael D. Greicius

    (Stanford University)

  • Iuliana Ionita-Laza

    (Columbia University)

Abstract

Recent advances in genome sequencing and imputation technologies provide an exciting opportunity to comprehensively study the contribution of genetic variants to complex phenotypes. However, our ability to translate genetic discoveries into mechanistic insights remains limited at this point. In this paper, we propose an efficient knockoff-based method, GhostKnockoff, for genome-wide association studies (GWAS) that leads to improved power and ability to prioritize putative causal variants relative to conventional GWAS approaches. The method requires only Z-scores from conventional GWAS and hence can be easily applied to enhance existing and future studies. The method can also be applied to meta-analysis of multiple GWAS allowing for arbitrary sample overlap. We demonstrate its performance using empirical simulations and two applications: (1) a meta-analysis for Alzheimer’s disease comprising nine overlapping large-scale GWAS, whole-exome and whole-genome sequencing studies and (2) analysis of 1403 binary phenotypes from the UK Biobank data in 408,961 samples of European ancestry. Our results demonstrate that GhostKnockoff can identify putatively functional variants with weaker statistical effects that are missed by conventional association tests.
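The abstract does not spell out how variants are selected once knockoff Z-scores are available. As a rough illustration only, the sketch below applies the standard knockoff+ filter to a pair of Z-score vectors. Everything specific here is an assumption and not taken from the abstract: the function name `knockoff_select`, the use of a single knockoff copy per variant, and the feature statistic W = Z² − Z̃². How the knockoff Z-scores themselves are generated is not covered.

```python
def knockoff_select(z, z_knockoff, q=0.1):
    """Return indices selected by the knockoff+ filter at target FDR q.

    z and z_knockoff are equal-length sequences of original and knockoff
    Z-scores (hypothetical inputs; one knockoff copy per variant is assumed).
    """
    # Feature statistic: positive when the original Z-score beats its knockoff.
    w = [a * a - b * b for a, b in zip(z, z_knockoff)]

    # Candidate thresholds are the distinct nonzero magnitudes of W, ascending.
    for t in sorted({abs(x) for x in w if x != 0}):
        # Large negative W values estimate the number of false selections;
        # the "+1" is the knockoff+ correction.
        false_est = 1 + sum(1 for x in w if x <= -t)
        selected = sum(1 for x in w if x >= t)
        if false_est / max(selected, 1) <= q:
            return [j for j, x in enumerate(w) if x >= t]

    # No threshold meets the target FDR, so nothing is selected.
    return []


# Ten strong signals followed by two null-like variants (made-up numbers).
z_scores = [6.0] * 10 + [0.5, 0.2]
z_knock = [0.3] * 10 + [0.4, 0.6]
picked = knockoff_select(z_scores, z_knock, q=0.1)
```

With the made-up inputs above, `picked` contains the indices of the ten strong signals. With only a handful of variants the filter selects nothing at q = 0.1, because the estimated false discovery proportion cannot fall below 1 divided by the number of selections.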

Suggested Citation

  • Zihuai He & Linxi Liu & Michael E. Belloy & Yann Guen & Aaron Sossin & Xiaoxia Liu & Xinran Qi & Shiyang Ma & Prashnna K. Gyawali & Tony Wyss-Coray & Hua Tang & Chiara Sabatti & Emmanuel Candès & Mich, 2022. "GhostKnockoff inference empowers identification of putative causal variants in genome-wide association studies," Nature Communications, Nature, vol. 13(1), pages 1-16, December.
  • Handle: RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-34932-z
    DOI: 10.1038/s41467-022-34932-z

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-022-34932-z
    File Function: Abstract
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Emmanuel Candès & Chiara Sabatti, 2020. "Discussion of the Paper “Prediction, Estimation, and Attribution” by B. Efron," International Statistical Review, International Statistical Institute, vol. 88(S1), pages 60-63, December.
    2. Ruth Heller, 2020. "Comments on: Hierarchical inference for genome-wide association studies: a view on methodology with software," Computational Statistics, Springer, vol. 35(1), pages 51-55, March.
    3. L Bottolo & S Richardson, 2019. "Discussion of ‘Gene hunting with hidden Markov model knockoffs’," Biometrika, Biometrika Trust, vol. 106(1), pages 19-22.
    4. Nikolaos Ignatiadis & Wolfgang Huber, 2021. "Covariate powered cross‐weighted multiple testing," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(4), pages 720-751, September.
    5. Quan Sun & Bryce T. Rowland & Jiawen Chen & Anna V. Mikhaylova & Christy Avery & Ulrike Peters & Jessica Lundin & Tara Matise & Steve Buyske & Ran Tao & Rasika A. Mathias & Alexander P. Reiner & Paul , 2024. "Improving polygenic risk prediction in admixed populations by explicitly modeling ancestral-differential effects via GAUDI," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    6. Vincent Michaud & Eulalie Lasseaux & David J. Green & Dave T. Gerrard & Claudio Plaisant & Tomas Fitzgerald & Ewan Birney & Benoît Arveiler & Graeme C. Black & Panagiotis I. Sergouniotis, 2022. "The contribution of common regulatory and protein-coding TYR variants to the genetic architecture of albinism," Nature Communications, Nature, vol. 13(1), pages 1-8, December.
    7. Alexendar R. Perez & Laura Sala & Richard K. Perez & Joana A. Vidigal, 2021. "CSC software corrects off-target mediated gRNA depletion in CRISPR-Cas9 essentiality screens," Nature Communications, Nature, vol. 12(1), pages 1-11, December.
    8. Elena V. Feofanova & Michael R. Brown & Taryn Alkis & Astrid M. Manuel & Xihao Li & Usman A. Tahir & Zilin Li & Kevin M. Mendez & Rachel S. Kelly & Qibin Qi & Han Chen & Martin G. Larson & Rozenn N. L, 2023. "Whole-Genome Sequencing Analysis of Human Metabolome in Multi-Ethnic Populations," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    9. Michel S. Naslavsky & Marilia O. Scliar & Guilherme L. Yamamoto & Jaqueline Yu Ting Wang & Stepanka Zverinova & Tatiana Karp & Kelly Nunes & José Ricardo Magliocco Ceroni & Diego Lima Carvalho & Carlo, 2022. "Whole-genome sequencing of 1,171 elderly admixed individuals from Brazil," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    10. Nicole Deflaux & Margaret Sunitha Selvaraj & Henry Robert Condon & Kelsey Mayo & Sara Haidermota & Melissa A. Basford & Chris Lunt & Anthony A. Philippakis & Dan M. Roden & Joshua C. Denny & Anjene Mu, 2023. "Demonstrating paths for unlocking the value of cloud genomics through cross cohort analysis," Nature Communications, Nature, vol. 14(1), pages 1-10, December.
    11. Andrea Wilderman & Eva D’haene & Machteld Baetens & Tara N. Yankee & Emma Wentworth Winchester & Nicole Glidden & Ellen Roets & Jo Dorpe & Sandra Janssens & Danny E. Miller & Miranda Galey & Kari M. B, 2024. "A distant global control region is essential for normal expression of anterior HOXA genes during mouse and human craniofacial development," Nature Communications, Nature, vol. 15(1), pages 1-23, December.
    12. Ruoyu Tian & Tian Ge & Hyeokmoon Kweon & Daniel B. Rocha & Max Lam & Jimmy Z. Liu & Kritika Singh & Daniel F. Levey & Joel Gelernter & Murray B. Stein & Ellen A. Tsai & Hailiang Huang & Christopher F., 2024. "Whole-exome sequencing in UK Biobank reveals rare genetic architecture for depression," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    13. Mary-Ellen Lynall & Blagoje Soskic & James Hayhurst & Jeremy Schwartzentruber & Daniel F. Levey & Gita A. Pathak & Renato Polimanti & Joel Gelernter & Murray B. Stein & Gosia Trynka & Menna R. Clatwor, 2022. "Genetic variants associated with psychiatric disorders are enriched at epigenetically active sites in lymphoid cells," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    14. Adrienne Tin & Pascal Schlosser & Pamela R. Matias-Garcia & Chris H. L. Thio & Roby Joehanes & Hongbo Liu & Zhi Yu & Antoine Weihs & Anselm Hoppmann & Franziska Grundner-Culemann & Josine L. Min & Vic, 2021. "Epigenome-wide association study of serum urate reveals insights into urate co-regulation and the SLC2A9 locus," Nature Communications, Nature, vol. 12(1), pages 1-18, December.
    15. Oriol Pich & Iker Reyes-Salazar & Abel Gonzalez-Perez & Nuria Lopez-Bigas, 2022. "Discovering the drivers of clonal hematopoiesis," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    16. Magdalena Zimoń & Yunfeng Huang & Anthi Trasta & Aliaksandr Halavatyi & Jimmy Z. Liu & Chia-Yen Chen & Peter Blattmann & Bernd Klaus & Christopher D. Whelan & David Sexton & Sally John & Wolfgang Hube, 2021. "Pairwise effects between lipid GWAS genes modulate lipid plasma levels and cellular uptake," Nature Communications, Nature, vol. 12(1), pages 1-16, December.
    17. Yangci Liu & Haoming Zhai & Helen Alemayehu & Jérôme Boulanger & Lee J. Hopkins & Alicia C. Borgeaud & Christina Heroven & Jonathan D. Howe & Kendra E. Leigh & Clare E. Bryant & Yorgo Modis, 2023. "Cryo-electron tomography of NLRP3-activated ASC complexes reveals organelle co-localization," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    18. Pedro Delicado & Daniel Peña, 2023. "Understanding complex predictive models with ghost variables," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(1), pages 107-145, March.
    19. Parsa Akbari & Olukayode A. Sosina & Jonas Bovijn & Karl Landheer & Jonas B. Nielsen & Minhee Kim & Senem Aykul & Tanima De & Mary E. Haas & George Hindy & Nan Lin & Ian R. Dinsmore & Jonathan Z. Luo , 2022. "Multiancestry exome sequencing reveals INHBE mutations associated with favorable fat distribution and protection from diabetes," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    20. Injeong Shim & Hiroyuki Kuwahara & NingNing Chen & Mais O. Hashem & Lama AlAbdi & Mohamed Abouelhoda & Hong-Hee Won & Pradeep Natarajan & Patrick T. Ellinor & Amit V. Khera & Xin Gao & Fowzan S. Alkur, 2023. "Clinical utility of polygenic scores for cardiometabolic disease in Arabs," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
