
Fast and Rigorous Computation of Gene and Pathway Scores from SNP-Based Summary Statistics

Author

Listed:
  • David Lamparter
  • Daniel Marbach
  • Rico Rueedi
  • Zoltán Kutalik
  • Sven Bergmann

Abstract

Integrating single nucleotide polymorphism (SNP) p-values from genome-wide association studies (GWAS) across genes and pathways is a strategy to improve statistical power and gain biological insight. Here, we present Pascal (Pathway scoring algorithm), a powerful tool for computing gene and pathway scores from SNP-phenotype association summary statistics. For gene score computation, we implemented analytic and efficient numerical solutions to calculate test statistics. We examined in particular the sum and the maximum of chi-squared statistics, which measure the average and the strongest association signals per gene, respectively. For pathway scoring, we use a modified Fisher method, which not only offers a significant power improvement over more traditional enrichment strategies, but also eliminates the problem of arbitrary threshold selection inherent in any enrichment approach based on binary pathway membership. We demonstrate the marked increase in power by analyzing summary statistics from dozens of large meta-studies for various traits. Our extensive testing indicates that our method not only excels in rigorous type I error control, but also results in more biologically meaningful discoveries.

Author Summary: Genome-wide association studies (GWAS) typically generate lists of trait- or disease-associated SNPs. Yet such output sheds little light on the underlying molecular mechanisms, and tools are needed to extract biological insight from the results at the SNP level. Pathway analysis tools integrate signals from multiple SNPs at various positions in the genome in order to map associated genomic regions to well-established pathways, i.e., sets of genes known to act in concert. The nature of GWAS association results requires specifically tailored methods for this task. Here, we present Pascal (Pathway scoring algorithm), a tool that allows gene- and pathway-level analysis of GWAS association results without the need to access the original genotypic data. Pascal was designed to be fast and accurate, and to have high power to detect relevant pathways. We extensively tested our approach on a large collection of real GWAS association results and saw better discovery of confirmed pathways than with other popular methods. We believe that these results, together with the ease of use of our publicly available software, will allow Pascal to become a useful addition to the toolbox of the GWAS community.
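As a rough illustration of how per-gene p-values can be pooled into a pathway score without choosing a significance threshold, the sketch below implements the classic Fisher combination. This is not Pascal's modified Fisher method, whose details the abstract does not give. The function names `chi2_sf_even_df` and `fisher_combine` are hypothetical, and the calculation assumes independent gene p-values.

```python
import math


def chi2_sf_even_df(x, df):
    """Survival function P(X > x) of a chi-squared variable with even df.

    For df = 2k the tail has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


def fisher_combine(p_values):
    """Classic Fisher combination of independent p-values (an assumption here).

    Under the null, -2 * sum(log p_i) follows a chi-squared distribution
    with 2k degrees of freedom, where k is the number of p-values.
    Returns (statistic, combined p-value).
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    stat = -2.0 * sum(math.log(p) for p in p_values)
    return stat, chi2_sf_even_df(stat, 2 * len(p_values))


if __name__ == "__main__":
    # Hypothetical gene-level p-values for one pathway.
    gene_p = [0.04, 0.20, 0.01, 0.60]
    stat, p = fisher_combine(gene_p)
    print(f"Fisher statistic = {stat:.3f}, combined p = {p:.4g}")
```

Because every gene contributes its full p-value rather than a binary "significant or not" label, no cutoff has to be chosen, which is the property the abstract highlights for threshold-free pathway scoring.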

Suggested Citation

  • David Lamparter & Daniel Marbach & Rico Rueedi & Zoltán Kutalik & Sven Bergmann, 2016. "Fast and Rigorous Computation of Gene and Pathway Scores from SNP-Based Summary Statistics," PLOS Computational Biology, Public Library of Science, vol. 12(1), pages 1-20, January.
  • Handle: RePEc:plo:pcbi00:1004714
    DOI: 10.1371/journal.pcbi.1004714

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004714
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1004714&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Duchesne, Pierre & Lafaye De Micheaux, Pierre, 2010. "Computing the distribution of quadratic forms: Further comparisons between the Liu-Tang-Zhang approximation and exact methods," Computational Statistics & Data Analysis, Elsevier, vol. 54(4), pages 858-862, April.
    2. R. W. Farebrother, 1984. "The Distribution of a Positive Linear Combination of χ² Random Variables," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 33(3), pages 332-339, November.
    3. Ayellet V Segrè & DIAGRAM Consortium & MAGIC investigators & Leif Groop & Vamsi K Mootha & Mark J Daly & David Altshuler, 2010. "Common Inherited Variation in Mitochondrial Genes Is Not Enriched for Associations with Type 2 Diabetes or Related Glycemic Traits," PLOS Genetics, Public Library of Science, vol. 6(8), pages 1-19, August.
    4. Tune H. Pers & Juha M. Karjalainen & Yingleong Chan & Harm-Jan Westra & Andrew R. Wood & Jian Yang & Julian C. Lui & Sailaja Vedantam & Stefan Gustafsson & Tonu Esko & Tim Frayling & Elizabeth K. Spel, 2015. "Biological interpretation of genome-wide association studies using predicted gene functions," Nature Communications, Nature, vol. 6(1), pages 1-9, May.

    Citations



    Cited by:

    1. Tapati Basak & Kazuhisa Nagashima & Satoshi Kajimoto & Takahisa Kawaguchi & Yasuharu Tabara & Fumihiko Matsuda & Ryo Yamada, 2020. "A Geometry-Based Multiple Testing Correction for Contingency Tables by Truncated Normal Distribution," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 12(1), pages 63-77, April.
    2. Go Sato & Yuya Shirai & Shinichi Namba & Ryuya Edahiro & Kyuto Sonehara & Tsuyoshi Hata & Mamoru Uemura & Koichi Matsuda & Yuichiro Doki & Hidetoshi Eguchi & Yukinori Okada, 2023. "Pan-cancer and cross-population genome-wide association studies dissect shared genetic backgrounds underlying carcinogenesis," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    3. Akshat Singhal & Song Cao & Christopher Churas & Dexter Pratt & Santo Fortunato & Fan Zheng & Trey Ideker, 2020. "Multiscale community detection in Cytoscape," PLOS Computational Biology, Public Library of Science, vol. 16(10), pages 1-10, October.
    4. Olga A Vsevolozhskaya & Min Shi & Fengjiao Hu & Dmitri V Zaykin, 2020. "DOT: Gene-set analysis by combining decorrelated association statistics," PLOS Computational Biology, Public Library of Science, vol. 16(4), pages 1-25, April.
    5. Francesca Mateo & Zhengcheng He & Lin Mei & Gorka Ruiz de Garibay & Carmen Herranz & Nadia García & Amanda Lorentzian & Alexandra Baiges & Eline Blommaert & Antonio Gómez & Oriol Mirallas & Anna Garri, 2022. "Modification of BRCA1-associated breast cancer risk by HMMR overexpression," Nature Communications, Nature, vol. 13(1), pages 1-16, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lina Cai & Tomas Gonzales & Eleanor Wheeler & Nicola D. Kerrison & Felix R. Day & Claudia Langenberg & John R. B. Perry & Soren Brage & Nicholas J. Wareham, 2023. "Causal associations between cardiorespiratory fitness and type 2 diabetes," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    2. Chen, Tong & Lumley, Thomas, 2019. "Numerical evaluation of methods approximating the distribution of a large quadratic form in normal variables," Computational Statistics & Data Analysis, Elsevier, vol. 139(C), pages 75-81.
    3. Jacob Joseph & Chang Liu & Qin Hui & Krishna Aragam & Zeyuan Wang & Brian Charest & Jennifer E. Huffman & Jacob M. Keaton & Todd L. Edwards & Serkalem Demissie & Luc Djousse & Juan P. Casas & J. Micha, 2022. "Genetic architecture of heart failure with preserved versus reduced ejection fraction," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    4. Jiménez-Gamero, M.D. & Alba-Fernández, M.V. & Jodrá, P. & Barranco-Chamorro, I., 2017. "Fast tests for the two-sample problem based on the empirical characteristic function," Mathematics and Computers in Simulation (MATCOM), Elsevier, vol. 137(C), pages 390-410.
    5. Lili Liu & Atlas Khan & Elena Sanchez-Rodriguez & Francesca Zanoni & Yifu Li & Nicholas Steers & Olivia Balderes & Junying Zhang & Priya Krithivasan & Robert A. LeDesma & Clara Fischman & Scott J. Heb, 2022. "Genetic regulation of serum IgA levels and susceptibility to common immune, infectious, kidney, and cardio-metabolic traits," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    6. Brittany L. Mitchell & Jake R. Saklatvala & Nick Dand & Fiona A. Hagenbeek & Xin Li & Josine L. Min & Laurent Thomas & Meike Bartels & Jouke Hottenga & Michelle K. Lupton & Dorret I. Boomsma & Xianjun, 2022. "Genome-wide association meta-analysis identifies 29 new acne susceptibility loci," Nature Communications, Nature, vol. 13(1), pages 1-9, December.
    7. Junjiao Feng & Liang Zhang & Chunhui Chen & Jintao Sheng & Zhifang Ye & Kanyin Feng & Jing Liu & Ying Cai & Bi Zhu & Zhaoxia Yu & Chuansheng Chen & Qi Dong & Gui Xue, 2022. "A cognitive neurogenetic approach to uncovering the structure of executive functions," Nature Communications, Nature, vol. 13(1), pages 1-19, December.
    8. Pötscher, Benedikt M. & Preinerstorfer, David, 2021. "Valid Heteroskedasticity Robust Testing," MPRA Paper 107420, University Library of Munich, Germany.
    9. Ghiglietti, Andrea & Paganoni, Anna Maria, 2017. "Exact tests for the means of Gaussian stochastic processes," Statistics & Probability Letters, Elsevier, vol. 131(C), pages 102-107.
    10. Aysu Okbay & Jonathan P. Beauchamp & Mark Alan Fontana & James J. Lee & Tune H. Pers & Cornelius A. Rietveld & Patrick Turley & Guo-Bo Chen & Valur Emilsson & S. Fleur W. Meddens & Sven Oskarsson & Jo, 2016. "Genome-wide association study identifies 74 loci associated with educational attainment," Nature, Nature, vol. 533(7604), pages 539-542, May.
    11. Michael G. Levin & Noah L. Tsao & Pankhuri Singhal & Chang Liu & Ha My T. Vy & Ishan Paranjpe & Joshua D. Backman & Tiffany R. Bellomo & William P. Bone & Kiran J. Biddinger & Qin Hui & Ozan Dikilitas, 2022. "Genome-wide association and multi-trait analyses characterize the common genetic architecture of heart failure," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    12. Niina Sandholm & Rany M Salem & Amy Jayne McKnight & Eoin P Brennan & Carol Forsblom & Tamara Isakova & Gareth J McKay & Winfred W Williams & Denise M Sadlier & Ville-Petteri Mäkinen & Elizabeth J Swa, 2012. "New Susceptibility Loci Associated with Kidney Disease in Type 1 Diabetes," PLOS Genetics, Public Library of Science, vol. 8(9), pages 1-13, September.
    13. Popović, Božidar V. & Mijanović, Andjela & Genç, Ali İ., 2020. "On linear combination of generalized logistic random variables with an application to financial returns," Applied Mathematics and Computation, Elsevier, vol. 381(C).
    14. Sofer Tamar, 2017. "Confidence intervals for heritability via Haseman-Elston regression," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 16(4), pages 259-273, September.
    15. Sanae Rujivan & Athinan Sutchada & Kittisak Chumpong & Napat Rujeerapaiboon, 2023. "Analytically Computing the Moments of a Conic Combination of Independent Noncentral Chi-Square Random Variables and Its Application for the Extended Cox–Ingersoll–Ross Process with Time-Varying Dimens," Mathematics, MDPI, vol. 11(5), pages 1-29, March.
    16. Roberto Colombi & Sabrina Giordano, 2019. "Likelihood-based tests for a class of misspecified finite mixture models for ordinal categorical data," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(4), pages 1175-1202, December.
    17. Colombi, Roberto, 2020. "Selection tests for possibly misspecified hierarchical multinomial marginal models," Econometrics and Statistics, Elsevier, vol. 16(C), pages 136-147.
    18. Kristina M. Garske & Asha Kar & Caroline Comenho & Brunilda Balliu & David Z. Pan & Yash V. Bhagat & Gregory Rosenberg & Amogha Koka & Sankha Subhra Das & Zong Miao & Janet S. Sinsheimer & Jaakko Kapr, 2023. "Increased body mass index is linked to systemic inflammation through altered chromatin co-accessibility in human preadipocytes," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    19. Zhao Wang & Qian Liang & Xinyi Qian & Bolang Hu & Zhanye Zheng & Jianhua Wang & Yuelin Hu & Zhengkai Bao & Ke Zhao & Yao Zhou & Xiangling Feng & Xianfu Yi & Jin Li & Jiandang Shi & Zhe Liu & Jihui Hao, 2023. "An autoimmune pleiotropic SNP modulates IRF5 alternative promoter usage through ZBTB3-mediated chromatin looping," Nature Communications, Nature, vol. 14(1), pages 1-23, December.
    20. Li, Muyi & Zhang, Yanfen, 2022. "Bootstrapping multivariate portmanteau tests for vector autoregressive models with weak assumptions on errors," Computational Statistics & Data Analysis, Elsevier, vol. 165(C).
