Leveraging correlations between variants in polygenic risk scores to detect heterogeneity in GWAS cohorts

Author

Listed:
  • Jie Yuan
  • Henry Xing
  • Alexandre Louis Lamy
  • The Schizophrenia Working Group of the Psychiatric Genomics Consortium
  • Todd Lencz
  • Itsik Pe’er

Abstract

Evidence from both GWAS and clinical observation has suggested that certain psychiatric, metabolic, and autoimmune diseases are heterogeneous, comprising multiple subtypes with distinct genomic etiologies and Polygenic Risk Scores (PRS). However, the presence of subtypes within many phenotypes is frequently unknown. We present CLiP (Correlated Liability Predictors), a method to detect heterogeneity in single GWAS cohorts. CLiP calculates a weighted sum of correlations between SNPs contributing to a PRS on the case/control liability scale. We demonstrate mathematically and through simulation that among i.i.d. homogeneous cases generated by a liability threshold model, significant anti-correlations are expected between otherwise independent predictors due to ascertainment on the hidden liability score. In the presence of heterogeneity from distinct etiologies, confounding by covariates, or mislabeling, these correlation patterns are altered predictably. We further extend our method to two additional association study designs: CLiP-X for quantitative predictors in applications such as transcriptome-wide association, and CLiP-Y for quantitative phenotypes, where there is no clear distinction between cases and controls. Through simulations, we demonstrate that CLiP and its extensions reliably distinguish between homogeneous and heterogeneous cohorts when the PRS explains as little as 3% of the variance on the liability scale and cohorts comprise 50,000 to 100,000 samples, an increasingly practical size for modern GWAS. We apply CLiP to heterogeneity detection in schizophrenia cohorts totaling more than 50,000 cases and controls collected by the Psychiatric Genomics Consortium. We observe significant heterogeneity in mega-analysis of the combined PGC data (p-value 8.54 × 10⁻⁴), as well as in individual cohorts meta-analyzed using Fisher’s method (p-value 0.03), based on significantly associated variants.
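
For readers unfamiliar with Fisher's method, mentioned above for combining the individual-cohort results, the following is a minimal Python sketch. It is not the authors' code: the function name and the example p-values are hypothetical and are not taken from the paper. The method assumes the per-cohort tests are independent, and the sketch uses the closed-form tail of the chi-square distribution for an even number of degrees of freedom.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    Returns (statistic, combined_p), where statistic = -2 * sum(ln p_i)
    follows a chi-square distribution with 2k degrees of freedom under
    the null hypothesis, k being the number of p-values.
    """
    if not pvalues or any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must be non-empty and lie in (0, 1]")
    k = len(pvalues)
    statistic = -2.0 * sum(math.log(p) for p in pvalues)
    # For 2k degrees of freedom the chi-square survival function is
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


# Hypothetical per-cohort p-values, for illustration only.
example_pvalues = [0.20, 0.08, 0.35, 0.04]
stat, combined = fisher_combined_pvalue(example_pvalues)
print(f"chi-square statistic = {stat:.3f}, combined p-value = {combined:.4f}")
```

Several cohorts with individually modest p-values can yield a smaller combined p-value, which is why a meta-analysis of this kind can reach significance even when few single cohorts do.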
We also apply CLiP-Y to detect heterogeneity in neuroticism in over 10,000 individuals from the UK Biobank and detect heterogeneity with a p-value of 1.68 × 10⁻⁹. Scores were not significantly reduced when partitioning by known subclusters (“Depression” and “Worry”), suggesting that these factors are not the primary source of observed heterogeneity.

Author summary

Several traits, such as bipolar disease, are known to be heterogeneous and comprise distinct subtypes with unique genomic associations. For other traits such as schizophrenia, heterogeneity may be suspected, but specific subtypes are less well characterized. Furthermore, conventional mixture model-based methods to detect subtypes in genome-wide association data struggle with the high polygenicity of complex traits. We propose CLiP (Correlated Liability Predictors), a method that does not identify subtype-specific effects, but is very well-powered to detect heterogeneity of any kind within the very weak signals of GWAS. CLiP serves as a method to flag particular phenotypes for potential further study into the genomic factors driving heterogeneity, as well as a means to evaluate the transferability of polygenic risk scores. We also develop extensions of CLiP applicable to scoring heterogeneity in quantitative phenotypes and quantitative predictors such as gene expression. We apply CLiP to scoring heterogeneity in schizophrenia cohorts from the Psychiatric Genomics Consortium and neuroticism in individuals in the UK Biobank and find significant heterogeneity in both phenotypes, warranting further study.
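
The mechanism described in the abstract can be illustrated with a small simulation. The Python sketch below is not CLiP itself and is not the authors' code: it uses an unweighted mean of pairwise correlations rather than the weighted sum the abstract describes, and every name and parameter value in it is an assumption chosen for illustration. In particular, it gives 10 SNPs a combined 50% of liability variance, far more than a realistic PRS, so that the effect is visible in a few thousand cases. Heterogeneity is modeled as mislabeling, with half of the labeled cases drawn from controls.

```python
import math
import random
from itertools import combinations
from statistics import NormalDist


def simulate(n, n_snps, h2, prevalence, rng):
    """Simulate a liability threshold model; return (cases, controls)."""
    beta = math.sqrt(h2 / n_snps)      # equal per-SNP effect (assumption)
    noise_sd = math.sqrt(1.0 - h2)     # liability has total variance 1
    threshold = NormalDist().inv_cdf(1.0 - prevalence)
    geno_sd = math.sqrt(0.5)           # SD of a Binomial(2, 0.5) genotype
    cases, controls = [], []
    for _ in range(n):
        # Standardized genotype: allele count 0/1/2 at allele frequency 0.5.
        x = [((rng.random() < 0.5) + (rng.random() < 0.5) - 1) / geno_sd
             for _ in range(n_snps)]
        liability = beta * sum(x) + rng.gauss(0.0, noise_sd)
        (cases if liability > threshold else controls).append(x)
    return cases, controls


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def mean_pairwise_correlation(cohort):
    """Unweighted mean correlation over all pairs of SNPs in a cohort."""
    columns = list(zip(*cohort))
    pairs = list(combinations(range(len(columns)), 2))
    return sum(pearson(columns[i], columns[j]) for i, j in pairs) / len(pairs)


rng = random.Random(0)
cases, controls = simulate(n=60_000, n_snps=10, h2=0.5, prevalence=0.05, rng=rng)

half = len(cases) // 2
homogeneous = cases                               # all true cases
heterogeneous = cases[:half] + controls[:half]    # half are mislabeled controls

homog_score = mean_pairwise_correlation(homogeneous)
hetero_score = mean_pairwise_correlation(heterogeneous)
print(f"homogeneous cases:   mean pairwise correlation = {homog_score:+.4f}")
print(f"heterogeneous cases: mean pairwise correlation = {hetero_score:+.4f}")
```

Under these assumptions the SNPs are independent in the population, so a negative mean correlation among the true cases arises only from selecting on the hidden liability. Mixing in mislabeled controls shifts the mean correlation upward, the kind of departure from the homogeneous expectation that a heterogeneity score can be built on.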

Suggested Citation

  • Jie Yuan & Henry Xing & Alexandre Louis Lamy & The Schizophrenia Working Group of the Psychiatric Genomics Consortium & Todd Lencz & Itsik Pe’er, 2020. "Leveraging correlations between variants in polygenic risk scores to detect heterogeneity in GWAS cohorts," PLOS Genetics, Public Library of Science, vol. 16(9), pages 1-35, September.
  • Handle: RePEc:plo:pgen00:1009015
    DOI: 10.1371/journal.pgen.1009015

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1009015
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1009015&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pgen.1009015?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item