IDEAS home Printed from https://ideas.repec.org/a/plo/pgen00/1006529.html

A Hidden Markov Model Approach for Simultaneously Estimating Local Ancestry and Admixture Time Using Next Generation Sequence Data in Samples of Arbitrary Ploidy

Author

Listed:
  • Russell Corbett-Detig
  • Rasmus Nielsen

Abstract

Admixture, the mixing of genomes from divergent populations, is increasingly appreciated as a central process in evolution. To characterize and quantify patterns of admixture across the genome, a number of methods have been developed for local ancestry inference. However, existing approaches have several shortcomings. First, all local ancestry inference methods require some prior assumption about the expected ancestry tract lengths. Second, existing methods generally require genotypes, which are not feasible to obtain for many next-generation sequencing projects. Third, many methods assume that samples are diploid, but a wide variety of sequencing applications fail to meet this assumption. To address these issues, we introduce a novel hidden Markov model for estimating local ancestry that models read pileup data rather than genotypes, generalizes to arbitrary ploidy, and can estimate the time since admixture during local ancestry inference. We demonstrate that our method can simultaneously estimate the time since admixture and local ancestry with good accuracy, and that it performs well on samples of high ploidy (i.e., 100 or more chromosomes). As this method is very general, we expect it will be useful for local ancestry inference in a wider variety of populations than has previously been possible. We then apply our method to pooled sequencing data derived from populations of Drosophila melanogaster on an ancestry cline on the east coast of North America. We find that local recombination rates are negatively correlated with the proportion of African ancestry, suggesting that selection against foreign ancestry is least efficient in low-recombination regions. Finally, we show that clinal outlier loci are enriched for genes associated with gene regulatory functions, consistent with a role of regulatory evolution in ecological adaptation of admixed D. melanogaster populations.
Our results illustrate the potential of local ancestry inference for elucidating fundamental evolutionary processes.

Author Summary

When divergent populations hybridize, their offspring obtain portions of their genomes from each parent population. Although the average ancestry proportion in each descendant is equal to the proportion of ancestors from each of the ancestral populations, the contribution of each ancestry type is variable across the genome. Estimating local ancestry within admixed individuals is a fundamental goal for evolutionary genetics, and here we develop a method for doing this that circumvents many of the problems associated with existing methods. Briefly, our method can use short-read data rather than genotypes and can be applied to samples with any number of chromosomes. Furthermore, our method simultaneously estimates local ancestry and the number of generations since admixture, that is, the time since the two ancestral populations first encountered each other. Finally, in applying our method to data from an admixture zone between ancestral populations of Drosophila melanogaster, we find many lines of evidence consistent with natural selection operating against the introduction of foreign ancestry into populations of one predominant ancestry type. Because of the generality of this method, we expect that it will be useful for a wide variety of existing and ongoing research projects.
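
The kind of model the abstract describes can be illustrated with a minimal Python sketch. It is not the authors' implementation; it is a simplified toy under stated assumptions: two ancestries A and B, a single admixture pulse t generations ago with ancestry-A proportion m, hidden states equal to the number of ancestry-A chromosomes in a sample of ploidy n, and read counts that are binomial around the expected reference-allele frequency. All function names, the error rate, and the toy data are hypothetical.

```python
import math
import numpy as np

def binom_pmf(n, p):
    """Binomial(n, p) probabilities for 0..n successes."""
    return np.array([math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)])

def transition_matrix(n, m, t, r):
    """T[k, j] = P(j ancestry-A chromosomes at next site | k at this site).

    Assumption: each chromosome switches ancestry along the genome as a
    two-state Markov chain with total rate t per Morgan; r is in Morgans.
    """
    decay = math.exp(-t * r)
    stay_a = m + (1 - m) * decay   # a chromosome that is A stays A
    b_to_a = m * (1 - decay)       # a chromosome that is B becomes A
    T = np.zeros((n + 1, n + 1))
    for k in range(n + 1):
        # next A-count = (A chromosomes that stay A) + (B chromosomes that switch)
        T[k] = np.convolve(binom_pmf(k, stay_a), binom_pmf(n - k, b_to_a))
    return T

def emission(n, ref_reads, depth, f_a, f_b, err=0.01):
    """P(read counts | state k) for every k, from the read pileup at one site.

    Simplification: reads are binomial around the expected sample allele
    frequency, ignoring the sampling of alleles onto individual chromosomes.
    """
    k = np.arange(n + 1)
    p = (k * f_a + (n - k) * f_b) / n    # expected reference-allele frequency
    p = p * (1 - err) + (1 - p) * err    # symmetric sequencing error (assumed)
    return math.comb(depth, ref_reads) * p**ref_reads * (1 - p)**(depth - ref_reads)

def log_likelihood(n, m, t, sites):
    """Scaled forward algorithm.

    sites: list of (morgans_from_previous_site, ref_reads, depth, f_a, f_b).
    """
    alpha = binom_pmf(n, m)              # stationary distribution as the prior
    loglik = 0.0
    for i, (r, c, d, f_a, f_b) in enumerate(sites):
        if i > 0:
            alpha = alpha @ transition_matrix(n, m, t, r)
        alpha = alpha * emission(n, c, d, f_a, f_b)
        scale = alpha.sum()
        loglik += math.log(scale)
        alpha = alpha / scale
    return loglik

def estimate_time(n, m, sites, grid):
    """Pick the admixture time on a grid that maximizes the likelihood."""
    return max(grid, key=lambda t: log_likelihood(n, m, t, sites))

# Hypothetical toy data: (distance in Morgans, ref reads, depth, f_a, f_b).
toy_sites = [(0.0, 9, 10, 0.9, 0.1), (0.001, 8, 10, 0.8, 0.2),
             (0.002, 2, 10, 0.9, 0.1), (0.001, 1, 10, 0.95, 0.2)]
t_hat = estimate_time(n=2, m=0.5, sites=toy_sites, grid=[1, 10, 100, 1000])
```

Treating the admixture time as a parameter of the transition matrix is what lets the time and the local ancestry states be inferred together; the grid search here stands in for whatever optimizer a real implementation would use.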

Suggested Citation

  • Russell Corbett-Detig & Rasmus Nielsen, 2017. "A Hidden Markov Model Approach for Simultaneously Estimating Local Ancestry and Admixture Time Using Next Generation Sequence Data in Samples of Arbitrary Ploidy," PLOS Genetics, Public Library of Science, vol. 13(1), pages 1-40, January.
  • Handle: RePEc:plo:pgen00:1006529
    DOI: 10.1371/journal.pgen.1006529

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1006529
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1006529&type=printable
    Download Restriction: no



    Citations



    Cited by:

    1. Lokman Galal & Frédéric Ariey & Meriadeg Ar Gouilh & Marie-Laure Dardé & Azra Hamidović & Franck Letourneur & Franck Prugnolle & Aurélien Mercier, 2022. "A unique Toxoplasma gondii haplotype accompanied the global expansion of cats," Nature Communications, Nature, vol. 13(1), pages 1-13, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kai Yuan & Xumin Ni & Chang Liu & Yuwen Pan & Lian Deng & Rui Zhang & Yang Gao & Xueling Ge & Jiaojiao Liu & Xixian Ma & Haiyi Lou & Taoyang Wu & Shuhua Xu, 2021. "Refining models of archaic admixture in Eurasia with ArchaicSeeker 2.0," Nature Communications, Nature, vol. 12(1), pages 1-15, December.
    2. Leonardo Vallini & Carlo Zampieri & Mohamed Javad Shoaee & Eugenio Bortolini & Giulia Marciani & Serena Aneli & Telmo Pievani & Stefano Benazzi & Alberto Barausse & Massimo Mezzavilla & Michael D. Pet, 2024. "The Persian plateau served as hub for Homo sapiens after the main out of Africa dispersal," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    3. Mark S Hibbins & Matthew W Hahn, 2021. "The effects of introgression across thousands of quantitative traits revealed by gene expression in wild tomatoes," PLOS Genetics, Public Library of Science, vol. 17(11), pages 1-20, November.
