IDEAS home Printed from https://ideas.repec.org/a/plo/pcbi00/1005788.html

A quadratically regularized functional canonical correlation analysis for identifying the global structure of pleiotropy with NGS data

Author

Listed:
  • Nan Lin
  • Yun Zhu
  • Ruzong Fan
  • Momiao Xiong

Abstract

Investigating the pleiotropic effects of genetic variants can increase statistical power, provide important information for understanding the complex genetic structures of disease, and offer powerful tools for designing effective treatments with fewer side effects. However, the current multiple phenotype association analysis paradigm lacks breadth (the number of phenotypes and genetic variants jointly analyzed at the same time) and depth (the hierarchical structure of phenotypes and genotypes). A key issue for high dimensional pleiotropic analysis is to effectively extract informative internal representations and features from high dimensional genotype and phenotype data. To explore correlation information of genetic variants, effectively reduce data dimensions, and overcome critical barriers in advancing the development of novel statistical methods and computational algorithms for genetic pleiotropic analysis, we proposed a new statistical method, referred to as quadratically regularized functional CCA (QRFCCA), for association analysis. It combines three approaches: (1) quadratically regularized matrix factorization, (2) functional data analysis and (3) canonical correlation analysis (CCA). Large-scale simulations show that the QRFCCA has much higher power than the ten competing statistics while retaining appropriate type 1 error rates. To further evaluate performance, the QRFCCA and ten other statistics are applied to the whole genome sequencing dataset from the TwinsUK study. We identify a total of 79 genes with rare variants and 67 genes with common variants significantly associated with the 46 traits using QRFCCA. The results show that the QRFCCA substantially outperforms the ten other statistics.

Author summary: Association analysis of multiple phenotypes can unravel the genetic pleiotropic structures of multiple phenotypes and provide a powerful tool for developing drugs with fewer side effects. To increase the power of the tests for high dimensional association analysis of multiple phenotypes with next-generation sequencing data, a key issue is to develop novel statistics that can effectively extract informative internal representations and features from high dimensional data. However, the current paradigm of association analysis of multiple phenotypes does not efficiently utilize the rich correlation structure of the genotype and phenotype data. To shift the paradigm of association analysis from shallow multivariate analysis to comprehensive functional analysis, we proposed a new general statistical framework, referred to as quadratically regularized functional canonical correlation analysis (QRFCCA), for association testing, which explores rich correlation information in the genotype and phenotype data. Large-scale simulations demonstrate that the QRFCCA has much higher power than many existing statistics while retaining appropriate type 1 error rates. To further evaluate the new approach, the QRFCCA is also applied to the TwinsUK study with 46 traits and sequencing data. The results show that the QRFCCA substantially outperforms the other statistics.
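
The abstract names canonical correlation analysis (CCA) as one of the three ingredients of QRFCCA. As a rough illustration of that single ingredient only, the following is a minimal sketch of classical CCA between a genotype-like matrix and a phenotype-like matrix. It is not the authors' QRFCCA: it includes neither the functional data analysis step nor the quadratically regularized matrix factorization. The function name `canonical_correlations` and the `ridge` parameter are assumptions made for this example, and the ridge term is only a small numerical stabilizer.

```python
import numpy as np

def canonical_correlations(X, Y, ridge=1e-8):
    """Classical CCA sketch: canonical correlations between X and Y.

    X is (n_samples, p) and Y is (n_samples, q); rows are individuals.
    `ridge` (hypothetical parameter) adds a small multiple of the identity
    to each covariance block so the Cholesky factorizations succeed.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Center each column
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Sample covariance blocks
    Sxx = X.T @ X / (n - 1) + ridge * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + ridge * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    # Whitening via Cholesky factors: K = Lx^{-1} Sxy Ly^{-T}
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    K = np.linalg.solve(Lx, Sxy)
    K = np.linalg.solve(Ly, K.T).T
    # Singular values of K are the canonical correlations
    s = np.linalg.svd(K, compute_uv=False)
    return np.clip(s, 0.0, 1.0)
```

Called with an (n, p) genotype matrix and an (n, q) phenotype matrix, the function returns min(p, q) canonical correlations in decreasing order. Association tests in the CCA literature are typically built from such correlations, but how the paper constructs its test statistic is not stated in this abstract.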

Suggested Citation

  • Nan Lin & Yun Zhu & Ruzong Fan & Momiao Xiong, 2017. "A quadratically regularized functional canonical correlation analysis for identifying the global structure of pleiotropy with NGS data," PLOS Computational Biology, Public Library of Science, vol. 13(10), pages 1-33, October.
  • Handle: RePEc:plo:pcbi00:1005788
    DOI: 10.1371/journal.pcbi.1005788

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005788
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1005788&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Jose A Seoane & Colin Campbell & Ian N M Day & Juan P Casas & Tom R Gaunt, 2014. "Canonical Correlation Analysis for Gene-Based Pleiotropy Discovery," PLOS Computational Biology, Public Library of Science, vol. 10(10), pages 1-13, October.
    2. Paul F O’Reilly & Clive J Hoggart & Yotsawat Pomyen & Federico C F Calboli & Paul Elliott & Marjo-Riitta Jarvelin & Lachlan J M Coin, 2012. "MultiPhen: Joint Model of Multiple Phenotypes Can Increase Discovery in GWAS," PLOS ONE, Public Library of Science, vol. 7(5), pages 1-1, May.
    3. Matthew Stephens, 2013. "A Unified Framework for Association Analysis with Multiple Related Phenotypes," PLOS ONE, Public Library of Science, vol. 8(7), pages 1-19, July.
    4. Qiong Yang & Yuanjia Wang, 2012. "Methods for Analyzing Multivariate Phenotypes in Genetic Association Studies," Journal of Probability and Statistics, Hindawi, vol. 2012, pages 1-13, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Heejung Shim & Daniel I Chasman & Joshua D Smith & Samia Mora & Paul M Ridker & Deborah A Nickerson & Ronald M Krauss & Matthew Stephens, 2015. "A Multivariate Genome-Wide Association Analysis of 10 LDL Subfractions, and Their Response to Statin Treatment, in 1868 Caucasians," PLOS ONE, Public Library of Science, vol. 10(4), pages 1-20, April.
    2. Zhenchuan Wang & Qiuying Sha & Shurong Fang & Kui Zhang & Shuanglin Zhang, 2018. "Testing an optimally weighted combination of common and/or rare variants with multiple traits," PLOS ONE, Public Library of Science, vol. 13(7), pages 1-16, July.
    3. Zihuai He & Erin K Payne & Bhramar Mukherjee & Seunggeun Lee & Jennifer A Smith & Erin B Ware & Brisa N Sánchez & Teresa E Seeman & Sharon L R Kardia & Ana V Diez Roux, 2015. "Association between Stress Response Genes and Features of Diurnal Cortisol Curves in the Multi-Ethnic Study of Atherosclerosis: A New Multi-Phenotype Approach for Gene-Based Association Tests," PLOS ONE, Public Library of Science, vol. 10(5), pages 1-15, May.
    4. Zhenchuan Wang & Qiuying Sha & Shuanglin Zhang, 2016. "Joint Analysis of Multiple Traits Using "Optimal" Maximum Heritability Test," PLOS ONE, Public Library of Science, vol. 11(3), pages 1-12, March.
    5. Kai Wang, 2014. "Testing Genetic Association by Regressing Genotype over Multiple Phenotypes," PLOS ONE, Public Library of Science, vol. 9(9), pages 1-9, September.
    6. Jose A Seoane & Colin Campbell & Ian N M Day & Juan P Casas & Tom R Gaunt, 2014. "Canonical Correlation Analysis for Gene-Based Pleiotropy Discovery," PLOS Computational Biology, Public Library of Science, vol. 10(10), pages 1-13, October.
    7. Huanhuan Zhu & Shuanglin Zhang & Qiuying Sha, 2018. "A novel method to test associations between a weighted combination of phenotypes and genetic variants," PLOS ONE, Public Library of Science, vol. 13(1), pages 1-17, January.
    8. Jianjun Zhang & Qiuying Sha & Guanfu Liu & Xuexia Wang, 2019. "A gene based approach to test genetic association based on an optimally weighted combination of multiple traits," PLOS ONE, Public Library of Science, vol. 14(8), pages 1-17, August.
    9. Xue Yuan & Zhang Sanguo & Wang Jinjuan & Ding Juan & Li Qizhai, 2019. "A powerful test for ordinal trait genetic association analysis," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 18(2), pages 1-9, April.
    10. Dennis Meer & Oleksandr Frei & Tobias Kaufmann & Alexey A. Shadrin & Anna Devor & Olav B. Smeland & Wesley K. Thompson & Chun Chieh Fan & Dominic Holland & Lars T. Westlye & Ole A. Andreassen & Anders, 2020. "Understanding the genetic determinants of the brain with MOSTest," Nature Communications, Nature, vol. 11(1), pages 1-9, December.
    11. Yang, Chiao-Yu & Lei, Lihua & Ho, Nhat & Fithian, William, 2022. "BONuS: Multiple Multivariate Testing with a Data-Adaptive Test Statistic," Research Papers 4031, Stanford University, Graduate School of Business.
    12. Lin Zhang & Lei Sun, 2022. "A generalized robust allele‐based genetic association test," Biometrics, The International Biometric Society, vol. 78(2), pages 487-498, June.
    13. Gao Wang & Abhishek Sarkar & Peter Carbonetto & Matthew Stephens, 2020. "A simple new approach to variable selection in regression, with application to genetic fine mapping," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 82(5), pages 1273-1300, December.
    14. Young Lee & Suyeon Park & Sanghoon Moon & Juyoung Lee & Robert C. Elston & Woojoo Lee & Sungho Won, 2014. "On the Analysis of a Repeated Measure Design in Genome-Wide Association Analysis," IJERPH, MDPI, vol. 11(12), pages 1-21, November.
    15. Michael C Turchin & Matthew Stephens, 2019. "Bayesian multivariate reanalysis of large genetic studies identifies many new associations," PLOS Genetics, Public Library of Science, vol. 15(10), pages 1-18, October.
