
Methodological Considerations in Estimation of Phenotype Heritability Using Genome-Wide SNP Data, Illustrated by an Analysis of the Heritability of Height in a Large Sample of African Ancestry Adults

Author

Listed:
  • Fang Chen
  • Jing He
  • Jianqi Zhang
  • Gary K Chen
  • Venetta Thomas
  • Christine B Ambrosone
  • Elisa V Bandera
  • Sonja I Berndt
  • Leslie Bernstein
  • William J Blot
  • Qiuyin Cai
  • John Carpten
  • Graham Casey
  • Stephen J Chanock
  • Iona Cheng
  • Lisa Chu
  • Sandra L Deming
  • W Ryan Driver
  • Phyllis Goodman
  • Richard B Hayes
  • Anselm J M Hennis
  • Ann W Hsing
  • Jennifer J Hu
  • Sue A Ingles
  • Esther M John
  • Rick A Kittles
  • Suzanne Kolb
  • M Cristina Leske
  • Robert C Millikan
  • Kristine R Monroe
  • Adam Murphy
  • Barbara Nemesure
  • Christine Neslund-Dudas
  • Sarah Nyante
  • Elaine A Ostrander
  • Michael F Press
  • Jorge L Rodriguez-Gil
  • Ben A Rybicki
  • Fredrick Schumacher
  • Janet L Stanford
  • Lisa B Signorello
  • Sara S Strom
  • Victoria Stevens
  • David Van Den Berg
  • Zhaoming Wang
  • John S Witte
  • Suh-Yuh Wu
  • Yuko Yamamura
  • Wei Zheng
  • Regina G Ziegler
  • Alexander H Stram
  • Laurence N Kolonel
  • Loïc Le Marchand
  • Brian E Henderson
  • Christopher A Haiman
  • Daniel O Stram

Abstract

Height has an extremely polygenic pattern of inheritance. Genome-wide association studies (GWAS) have revealed hundreds of common variants that are associated with human height at genome-wide levels of significance. However, only a small fraction of phenotypic variation can be explained by the aggregate of these common variants. In a large study of African-American men and women (n = 14,419), we genotyped and analyzed 966,578 autosomal SNPs across the entire genome using a linear mixed model variance components approach implemented in the program GCTA (Yang et al., Nat Genet 2010), and estimated an additive heritability of 44.7% (SE: 3.7%) for this phenotype in a sample of evidently unrelated individuals. While this estimated value is similar to that given by Yang et al. in their analyses, we remain concerned about two related issues: (1) whether, in the complete absence of hidden relatedness, variance components methods have adequate power to estimate heritability when a very large number of SNPs are used in the analysis; and (2) whether estimation of heritability may be biased, in real studies, by low levels of residual hidden relatedness. We addressed the first question in a semi-analytic fashion by directly simulating the distribution of the score statistic for a test of zero heritability with and without low levels of relatedness. The second question was addressed by a careful comparison of the behavior of estimated heritability for both observed (self-reported) height and simulated phenotypes compared to imputation R² as a function of the number of SNPs used in the analysis.
These simulations help to address the important question of whether today's GWAS SNPs will remain useful for imputing causal variants that are discovered using very large sample sizes in future studies of height, or whether the causal variants themselves will need to be genotyped de novo in order to build a prediction model that ultimately captures a large fraction of the variability of height, and by implication other complex phenotypes. Our overall conclusions are that when study sizes are quite large (5,000 or so) the additive heritability estimate for height is not apparently biased upwards using the linear mixed model; however, there is evidence in our simulation that a very large number of causal variants (many thousands), each with a very small effect on phenotypic variance, will need to be discovered to fill the gap between the heritability explained by known versus unknown causal variants. We conclude that today's GWAS data will remain useful in the future for causal variant prediction, but that finding the causal variants that need to be predicted may be extremely laborious.
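
As a rough illustration of the kind of analysis the abstract describes, the following Python sketch simulates unrelated individuals, builds a genetic relationship matrix (GRM) from standardized SNP genotypes, simulates a polygenic phenotype, and estimates SNP heritability. It is not the authors' pipeline. GCTA fits the variance components by REML; this sketch substitutes the simpler Haseman-Elston cross-product regression to stay short. The sample size, SNP count, number of causal variants, allele frequency range, and all function names are assumptions chosen for illustration, and the SNPs are simulated independently (no linkage disequilibrium).

```python
import numpy as np

def simulate_genotypes(n, m, rng):
    """Allele counts (0/1/2) for n unrelated people at m independent SNPs (assumed, no LD)."""
    freqs = rng.uniform(0.05, 0.5, size=m)  # assumed allele-frequency range
    return rng.binomial(2, freqs, size=(n, m)).astype(float)

def standardize(G):
    """Center and scale each SNP by its sample allele frequency; drop monomorphic SNPs."""
    p = G.mean(axis=0) / 2.0
    keep = (p > 0) & (p < 1)
    G, p = G[:, keep], p[keep]
    return (G - 2 * p) / np.sqrt(2 * p * (1 - p))

def grm(Z):
    """Genetic relationship matrix A = Z Z^T / m from standardized genotypes."""
    return Z @ Z.T / Z.shape[1]

def simulate_phenotype(Z, h2, n_causal, rng):
    """Additive phenotype with target heritability h2 from randomly chosen causal SNPs."""
    causal = rng.choice(Z.shape[1], size=n_causal, replace=False)
    g = Z[:, causal] @ rng.normal(size=n_causal)
    g = (g - g.mean()) / g.std() * np.sqrt(h2)        # genetic variance = h2
    e = rng.normal(scale=np.sqrt(1 - h2), size=Z.shape[0])
    y = g + e
    return (y - y.mean()) / y.std()                   # unit phenotypic variance

def he_regression(A, y):
    """Haseman-Elston estimate: OLS slope of y_i * y_j on A_ij over pairs i < j."""
    iu = np.triu_indices_from(A, k=1)
    x = A[iu]
    z = np.outer(y, y)[iu]
    xc = x - x.mean()
    return float(xc @ (z - z.mean()) / (xc @ xc))

rng = np.random.default_rng(0)
n, m, true_h2 = 1000, 1000, 0.5                        # illustrative sizes only
Z = standardize(simulate_genotypes(n, m, rng))
A = grm(Z)
y = simulate_phenotype(Z, true_h2, n_causal=200, rng=rng)
h2_hat = he_regression(A, y)
print(f"simulated h2 = {true_h2}, Haseman-Elston estimate = {h2_hat:.3f}")
```

In this idealized setting the off-diagonal GRM entries have variance of roughly 1/m, so for a fixed sample size the estimate becomes noisier as more independent SNPs are added. That is one way to see why the abstract asks about power when a very large number of SNPs are used.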

Suggested Citation

  • Fang Chen & Jing He & Jianqi Zhang & Gary K Chen & Venetta Thomas & Christine B Ambrosone & Elisa V Bandera & Sonja I Berndt & Leslie Bernstein & William J Blot & Qiuyin Cai & John Carpten & Graham Ca, 2015. "Methodological Considerations in Estimation of Phenotype Heritability Using Genome-Wide SNP Data, Illustrated by an Analysis of the Heritability of Height in a Large Sample of African Ancestry Adults," PLOS ONE, Public Library of Science, vol. 10(6), pages 1-17, June.
  • Handle: RePEc:plo:pone00:0131106
    DOI: 10.1371/journal.pone.0131106

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0131106
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0131106&type=printable
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Daniel Svensson & Matilda Rentoft & Anna M Dahlin & Emma Lundholm & Pall I Olason & Andreas Sjödin & Carin Nylander & Beatrice S Melin & Johan Trygg & Erik Johansson, 2020. "A whole-genome sequenced control population in northern Sweden reveals subregional genetic differences," PLOS ONE, Public Library of Science, vol. 15(9), pages 1-18, September.
    2. Chuan Gao & Nan Wang & Xiuqing Guo & Julie T Ziegler & Kent D Taylor & Anny H Xiang & Yang Hai & Steven J Kridel & Jerry L Nadler & Fouad Kandeel & Leslie J Raffel & Yii-Der I Chen & Jill M Norris & J, 2015. "A Comprehensive Analysis of Common and Rare Variants to Identify Adiposity Loci in Hispanic Americans: The IRAS Family Study (IRASFS)," PLOS ONE, Public Library of Science, vol. 10(11), pages 1-17, November.
    3. Zaili Fang & Inyoung Kim & Jeesun Jung, 2018. "Semiparametric Kernel-Based Regression for Evaluating Interaction Between Pathway Effect and Covariate," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 23(1), pages 129-152, March.
    4. Arnab Maity & Xihong Lin, 2011. "Powerful Tests for Detecting a Gene Effect in the Presence of Possible Gene–Gene Interactions Using Garrote Kernel Machines," Biometrics, The International Biometric Society, vol. 67(4), pages 1271-1284, December.
    5. Paul S de Vries & Maria Sabater-Lleal & Daniel I Chasman & Stella Trompet & Tarunveer S Ahluwalia & Alexander Teumer & Marcus E Kleber & Ming-Huei Chen & Jie Jin Wang & John R Attia & Riccardo E Mario, 2017. "Comparison of HapMap and 1000 Genomes Reference Panels in a Large-Scale Genome-Wide Association Study," PLOS ONE, Public Library of Science, vol. 12(1), pages 1-22, January.
    6. Bo Jiang & Jun S. Liu, 2015. "Bayesian Partition Models for Identifying Expression Quantitative Trait Loci," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(512), pages 1350-1361, December.
    7. Rakesh Chettier & Lesa Nelson & James W Ogilvie & Hans M Albertsen & Kenneth Ward, 2015. "Haplotypes at LBX1 Have Distinct Inheritance Patterns with Opposite Effects in Adolescent Idiopathic Scoliosis," PLOS ONE, Public Library of Science, vol. 10(2), pages 1-11, February.
    8. Long Qu & Tobias Guennel & Scott L. Marshall, 2013. "Linear Score Tests for Variance Components in Linear Mixed Models and Applications to Genetic Association Studies," Biometrics, The International Biometric Society, vol. 69(4), pages 883-892, December.
    9. Michel S. Naslavsky & Marilia O. Scliar & Guilherme L. Yamamoto & Jaqueline Yu Ting Wang & Stepanka Zverinova & Tatiana Karp & Kelly Nunes & José Ricardo Magliocco Ceroni & Diego Lima Carvalho & Carlo, 2022. "Whole-genome sequencing of 1,171 elderly admixed individuals from Brazil," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    10. Steinrücken, Matthias & Paul, Joshua S. & Song, Yun S., 2013. "A sequentially Markov conditional sampling distribution for structured populations with migration and recombination," Theoretical Population Biology, Elsevier, vol. 87(C), pages 51-61.
    11. Teran Hidalgo, Sebastian J. & Wu, Michael C. & Engel, Stephanie M. & Kosorok, Michael R., 2018. "Goodness-of-fit test for nonparametric regression models: Smoothing spline ANOVA models as example," Computational Statistics & Data Analysis, Elsevier, vol. 122(C), pages 135-155.
    12. Lin Zhang & Inyoung Kim, 2021. "Finite mixtures of semiparametric Bayesian survival kernel machine regressions: Application to breast cancer gene pathway subgroup analysis," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(2), pages 251-269, March.
    13. Anshuman Sewda & A J Agopian & Elizabeth Goldmuntz & Hakon Hakonarson & Bernice E Morrow & Fadi Musfee & Deanne Taylor & Laura E Mitchell & on behalf of the Pediatric Cardiac Genomics Consortium, 2020. "Gene-based analyses of the maternal genome implicate maternal effect genes as risk factors for conotruncal heart defects," PLOS ONE, Public Library of Science, vol. 15(6), pages 1-15, June.
    14. Lin Yuan & Chang-An Yuan & De-Shuang Huang, 2017. "FAACOSE: A Fast Adaptive Ant Colony Optimization Algorithm for Detecting SNP Epistasis," Complexity, Hindawi, vol. 2017, pages 1-10, September.
    15. Chakraborty, Sounak, 2009. "Bayesian binary kernel probit model for microarray based cancer classification and gene selection," Computational Statistics & Data Analysis, Elsevier, vol. 53(12), pages 4198-4209, October.
    16. Carl Nettelblad, 2013. "Breakdown of Methods for Phasing and Imputation in the Presence of Double Genotype Sharing," PLOS ONE, Public Library of Science, vol. 8(3), pages 1-5, March.
    17. Viinikainen, Jutta & Bryson, Alex & Böckerman, Petri & Kari, Jaana T. & Lehtimäki, Terho & Raitakari, Olli & Viikari, Jorma & Pehkonen, Jaakko, 2022. "Does better education mitigate risky health behavior? A mendelian randomization study," Economics & Human Biology, Elsevier, vol. 46(C).
    18. Cavin K Ward-Caviness & Paul S de Vries & Kerri L Wiggins & Jennifer E Huffman & Lisa R Yanek & Lawrence F Bielak & Franco Giulianini & Xiuqing Guo & Marcus E Kleber & Tim Kacprowski & Stefan Groß & A, 2019. "Mendelian randomization evaluation of causal effects of fibrinogen on incident coronary heart disease," PLOS ONE, Public Library of Science, vol. 14(5), pages 1-18, May.
    19. Yunxuan Jiang & Karen N. Conneely & Michael P. Epstein, 2018. "Robust Rare-Variant Association Tests for Quantitative Traits in General Pedigrees," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 10(3), pages 491-505, December.
    20. Glen McGee & Ander Wilson & Thomas F. Webster & Brent A. Coull, 2023. "Bayesian multiple index models for environmental mixtures," Biometrics, The International Biometric Society, vol. 79(1), pages 462-474, March.
