Printed from https://ideas.repec.org/a/eee/thpobi/v117y2017icp51-63.html

Variance in estimated pairwise genetic distance under high versus low coverage sequencing: The contribution of linkage disequilibrium

Author

Listed:
  • Shpak, Max
  • Ni, Yang
  • Lu, Jie
  • Müller, Peter

Abstract

The mean pairwise genetic distance among haplotypes is an estimator of the population mutation rate θ and a standard measure of variation in a population. With the advent of next-generation sequencing (NGS) methods, this and other population parameters can be estimated under different modes of sampling. One approach is to sequence individual genomes with high coverage, and to calculate genetic distance over all sample pairs. The second approach, typically used for microbial samples or for tumor cells, is sequencing a large number of pooled genomes with very low individual coverage. With low coverage, pairwise genetic distances are calculated across independently sampled sites rather than across individual genomes. In this study, we show that the variance in genetic distance estimates is reduced with low coverage sampling if the mean pairwise linkage disequilibrium weighted by allele frequencies is positive. Practically, this means that if on average the most frequent alleles over pairs of loci are in positive linkage disequilibrium, low coverage sequencing results in improved estimates of θ, assuming similar per-site read depths. We show that this result holds under the expected distribution of allele frequencies and linkage disequilibria for an infinite sites model at mutation–drift equilibrium. From simulations, we find that the conditions for reduced variance only fail to hold in cases where variant alleles are few and at very low frequency. These results are applied to haplotype frequencies from a lung cancer tumor to compute the weighted linkage disequilibria and the expected error in estimated genetic distance using high versus low coverage.
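The two ways of computing pairwise genetic distance described in the abstract can be illustrated with a minimal sketch. It assumes biallelic sites coded 0/1 and a small made-up set of haplotypes; the function names and the data are hypothetical and are not taken from the paper. The first function averages Hamming distances over all haplotype pairs (the whole-genome, high coverage view), the second sums per-site heterozygosities computed from allele counts alone (the site-by-site view), and the third computes the standard two-locus linkage disequilibrium coefficient D = p_AB - p_A p_B. With complete data the two distance calculations agree exactly; the variance difference studied in the paper arises from how sites are sampled, which this sketch does not simulate.

```python
from itertools import combinations


def mean_pairwise_distance(haplotypes):
    """Average Hamming distance over all unordered pairs of haplotypes."""
    pairs = list(combinations(haplotypes, 2))
    total = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs)
    return total / len(pairs)


def per_site_distance(haplotypes):
    """Sum over sites of the probability that two distinct sampled
    haplotypes differ at that site, using only per-site allele counts."""
    n = len(haplotypes)
    total = 0.0
    for site in zip(*haplotypes):  # iterate over columns (sites)
        k = sum(site)  # copies of the allele coded 1 at this site
        total += 2 * k * (n - k) / (n * (n - 1))
    return total


def linkage_disequilibrium(haplotypes, i, j):
    """Two-locus coefficient D = p_ij - p_i * p_j for the alleles coded 1."""
    n = len(haplotypes)
    p_i = sum(h[i] for h in haplotypes) / n
    p_j = sum(h[j] for h in haplotypes) / n
    p_ij = sum(h[i] * h[j] for h in haplotypes) / n
    return p_ij - p_i * p_j


# Hypothetical example data: four haplotypes, three biallelic sites.
haplotypes = [
    (0, 0, 1),
    (1, 1, 0),
    (1, 1, 0),
    (0, 0, 0),
]

pi_pairs = mean_pairwise_distance(haplotypes)
pi_sites = per_site_distance(haplotypes)
d_01 = linkage_disequilibrium(haplotypes, 0, 1)
d_02 = linkage_disequilibrium(haplotypes, 0, 2)

print(pi_pairs, pi_sites, d_01, d_02)
```

In this toy data set sites 0 and 1 are in positive linkage disequilibrium and sites 0 and 2 in negative linkage disequilibrium. The frequency-weighted average of such pairwise terms is the quantity the abstract ties to the variance comparison, although the exact weighting used by the authors is not reproduced here.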

Suggested Citation

  • Shpak, Max & Ni, Yang & Lu, Jie & Müller, Peter, 2017. "Variance in estimated pairwise genetic distance under high versus low coverage sequencing: The contribution of linkage disequilibrium," Theoretical Population Biology, Elsevier, vol. 117(C), pages 51-63.
  • Handle: RePEc:eee:thpobi:v:117:y:2017:i:c:p:51-63
    DOI: 10.1016/j.tpb.2017.08.001

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0040580917300291
    Download Restriction: Full text for ScienceDirect subscribers only



    Citations


    Cited by:

    1. Freitas, Osmar & Araujo, Sabrina B.L. & Campos, Paulo R.A., 2022. "Speciation in a metapopulation model upon environmental changes," Ecological Modelling, Elsevier, vol. 468(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Pietro Biroli & Titus Galama & Stephanie von Hinke & Hans van Kippersluis & Kevin Thom, 2022. "Economics and Econometrics of Gene-Environment Interplay," Bristol Economics Discussion Papers 22/759, School of Economics, University of Bristol, UK.
    2. Chung-Feng Kao & Jia-Rou Liu & Hung Hung & Po-Hsiu Kuo, 2015. "A Robust GWSS Method to Simultaneously Detect Rare and Common Variants for Complex Disease," PLOS ONE, Public Library of Science, vol. 10(4), pages 1-14, April.
    3. Haipeng Li & Thomas Wiehe, 2013. "Coalescent Tree Imbalance and a Simple Test for Selective Sweeps Based on Microsatellite Variation," PLOS Computational Biology, Public Library of Science, vol. 9(5), pages 1-14, May.
    4. Shuxia Guo & Yunhua Hu & Yusong Ding & Jiaming Liu & Mei Zhang & Rulin Ma & Heng Guo & Kui Wang & Jia He & Yizhong Yan & Dongsheng Rui & Feng Sun & Lati Mu & Qiang Niu & Jingyu Zhang & Shugang Li, 2015. "Association between Eight Functional Polymorphisms and Haplotypes in the Cholesterol Ester Transfer Protein (CETP) Gene and Dyslipidemia in National Minority Adults in the Far West Region of China," IJERPH, MDPI, vol. 12(12), pages 1-14, December.
    5. Li Qin & Wu Rongling, 2009. "A Multilocus Model for Constructing a Linkage Disequilibrium Map in Human Populations," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 8(1), pages 1-25, February.
    6. Zhang, Hong & Wu, Zheyang, 2022. "The general goodness-of-fit tests for correlated data," Computational Statistics & Data Analysis, Elsevier, vol. 167(C).
    7. Wei Zhao & Erin B. Ware & Zihuai He & Sharon L. R. Kardia & Jessica D. Faul & Jennifer A. Smith, 2017. "Interaction between Social/Psychosocial Factors and Genetic Variants on Body Mass Index: A Gene-Environment Interaction Analysis in a Longitudinal Setting," IJERPH, MDPI, vol. 14(10), pages 1-17, September.
    8. Hou, Wei & Liu, Tian & Li, Yao & Li, Qin & Li, Jiahan & Das, Kiranmoy & Berg, Arthur & Wu, Rongling, 2009. "Multilocus genomics of outcrossing plant populations," Theoretical Population Biology, Elsevier, vol. 76(1), pages 68-76.
    9. Xiaoshuai Zhang & Xiaowei Yang & Zhongshang Yuan & Yanxun Liu & Fangyu Li & Bin Peng & Dianwen Zhu & Jinghua Zhao & Fuzhong Xue, 2013. "A PLSPM-Based Test Statistic for Detecting Gene-Gene Co-Association in Genome-Wide Association Study with Case-Control Design," PLOS ONE, Public Library of Science, vol. 8(4), pages 1-8, April.
    10. Kari E. North & Lisa J. Martin, 2008. "The Importance of Gene-Environment Interaction," Sociological Methods & Research, vol. 37(2), pages 164-200, November.
    11. Hu, Xin-Sheng & Hu, Yang & Chen, Xiaoyang, 2016. "Testing neutrality at copy-number-variable loci under the finite-allele and finite-site models," Theoretical Population Biology, Elsevier, vol. 112(C), pages 1-13.
    12. Shuo Jiao & Li Hsu & Sonja Berndt & Stéphane Bézieau & Hermann Brenner & Daniel Buchanan & Bette J Caan & Peter T Campbell & Christopher S Carlson & Graham Casey & Andrew T Chan & Jenny Chang-Claude &, 2012. "Genome-Wide Search for Gene-Gene Interactions in Colorectal Cancer," PLOS ONE, Public Library of Science, vol. 7(12), pages 1-14, December.
    13. Konstantin Schildknecht & Sven Olek & Thorsten Dickhaus, 2015. "Simultaneous Statistical Inference for Epigenetic Data," PLOS ONE, Public Library of Science, vol. 10(5), pages 1-15, May.
    14. Zhuling Yu & Wei Li & Deren Hou & Lin Zhou & Yanyao Deng & Mi Tian & Xialu Feng, 2015. "Relationship between Adiponectin Gene Polymorphisms and Late-Onset Alzheimer’s Disease," PLOS ONE, Public Library of Science, vol. 10(4), pages 1-11, April.
