
Leveraging Identity-by-Descent for Accurate Genotype Inference in Family Sequencing Data

Author

Listed:
  • Bingshan Li
  • Qiang Wei
  • Xiaowei Zhan
  • Xue Zhong
  • Wei Chen
  • Chun Li
  • Jonathan Haines

Abstract

Sequencing family DNA samples provides an attractive alternative to population-based designs for identifying rare variants associated with human disease, because causal variants are enriched in pedigrees. Previous studies showed that modeling family relatedness improves genotype calling accuracy compared to standard calling algorithms. Current family-based variant calling methods use sequencing data on single variants and ignore the identity-by-descent (IBD) sharing along the genome. In this study we describe a new computational framework to accurately estimate the IBD sharing from the sequencing data, and to utilize the inferred IBD among family members to jointly call genotypes in pedigrees. Through simulations and application to real data, we show that IBD can be reliably estimated across the genome, even at very low coverage (e.g. 2X), and that genotype accuracy can be dramatically improved. Moreover, the improvement is more pronounced for variants with low frequencies, especially at low to intermediate coverage (e.g. 10X to 20X), making our approach effective in studying rare variants in cost-effective whole genome sequencing in pedigrees. We hope that our tool will be useful to the research community for identifying rare variants for human disease through family-based sequencing.

Author Summary: To identify disease variants that occur less frequently in the population, sequencing families in which multiple individuals are affected is more powerful due to the enrichment of causal variants. An important step in such studies is to infer individual genotypes from sequencing data. Existing methods do not utilize full familial transmission information and therefore result in reduced accuracy of inferred genotypes. In this study we describe a new method that infers shared genetic material among family members and then incorporates the shared genomic information in a novel algorithm that can accurately infer genotypes. Our method is particularly advantageous when inferring low frequency variants from less sequence data, making it effective in analyzing genome-wide sequence data. We implemented the algorithm in a computationally efficient tool to facilitate cost-effective sequencing in families for identifying disease genetic variants.
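
The general idea that IBD sharing lets relatives borrow strength from each other's reads can be illustrated with a toy calculation. The sketch below is not the authors' algorithm or tool. It assumes a single biallelic site, a pair of relatives whose IBD state (0, 1 or 2 shared haplotypes) is already known, Hardy-Weinberg founder alleles with alternate allele frequency q, and a simple binomial read-error model. All function names and parameter values are hypothetical and chosen only for illustration.

```python
from itertools import product
from math import comb


def genotype_likelihood(n_ref, n_alt, g, err=0.01):
    """P(reads | genotype g), where g is the number of alt alleles (0, 1, 2).

    Assumed model: each read shows the alt allele with probability
    err (g=0), 0.5 (g=1) or 1 - err (g=2).
    """
    p_alt = {0: err, 1: 0.5, 2: 1.0 - err}[g]
    return comb(n_ref + n_alt, n_alt) * p_alt**n_alt * (1 - p_alt)**n_ref


def joint_prior(ibd, q):
    """P(g1, g2 | IBD state) for a pair, by enumerating founder alleles."""
    n_founder = 4 - ibd  # distinct founder alleles carried by the pair
    prior = {}
    for alleles in product((0, 1), repeat=n_founder):
        prob = 1.0
        for a in alleles:
            prob *= q if a else 1.0 - q
        if ibd == 0:    # no shared allele
            g1, g2 = alleles[0] + alleles[1], alleles[2] + alleles[3]
        elif ibd == 1:  # alleles[0] is shared by both individuals
            g1, g2 = alleles[0] + alleles[1], alleles[0] + alleles[2]
        else:           # both alleles shared: identical genotypes
            g1 = g2 = alleles[0] + alleles[1]
        prior[(g1, g2)] = prior.get((g1, g2), 0.0) + prob
    return prior


def posterior_g1(reads1, reads2, ibd, q, err=0.01):
    """Posterior over individual 1's genotype using both individuals' reads."""
    post = [0.0, 0.0, 0.0]
    for (g1, g2), pr in joint_prior(ibd, q).items():
        post[g1] += (pr
                     * genotype_likelihood(*reads1, g1, err)
                     * genotype_likelihood(*reads2, g2, err))
    total = sum(post)
    return [x / total for x in post]


if __name__ == "__main__":
    low_cov = (1, 1)   # individual 1: one ref read, one alt read (about 2X)
    relative = (3, 4)  # individual 2: three ref reads, four alt reads
    for ibd in (0, 1, 2):
        print(ibd, posterior_g1(low_cov, relative, ibd, q=0.01))
```

With a rare allele and only two reads, the unrelated (IBD 0) calculation in this toy setup favors the homozygous reference genotype for individual 1, while the IBD 2 calculation shifts the posterior strongly toward the heterozygote, because the relative's reads then count as evidence about the same pair of haplotypes.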

Suggested Citation

  • Bingshan Li & Qiang Wei & Xiaowei Zhan & Xue Zhong & Wei Chen & Chun Li & Jonathan Haines, 2015. "Leveraging Identity-by-Descent for Accurate Genotype Inference in Family Sequencing Data," PLOS Genetics, Public Library of Science, vol. 11(6), pages 1-19, June.
  • Handle: RePEc:plo:pgen00:1005271
    DOI: 10.1371/journal.pgen.1005271

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1005271
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1005271&type=printable
    Download Restriction: no

