
Inference of Population Splits and Mixtures from Genome-Wide Allele Frequency Data

Author

Listed:
  • Joseph K Pickrell
  • Jonathan K Pritchard

Abstract

Many aspects of the historical relationships between populations in a species are reflected in genetic data. Inferring these relationships from genetic data, however, remains a challenging task. In this paper, we present a statistical model for inferring the patterns of population splits and mixtures in multiple populations. In our model, the sampled populations in a species are related to their common ancestor through a graph of ancestral populations. Using genome-wide allele frequency data and a Gaussian approximation to genetic drift, we infer the structure of this graph. We applied this method to a set of 55 human populations and a set of 82 dog breeds and wild canids. In both species, we show that a simple bifurcating tree does not fully describe the data; in contrast, we infer many migration events. While some of the migration events that we find have been detected previously, many have not. For example, in the human data, we infer that Cambodians trace approximately 16% of their ancestry to a population ancestral to other extant East Asian populations. In the dog data, we infer that both the boxer and basenji trace a considerable fraction of their ancestry (9% and 25%, respectively) to wolves subsequent to domestication and that East Asian toy breeds (the Shih Tzu and the Pekingese) result from admixture between modern toy breeds and “ancient” Asian breeds. Software implementing the model described here, called TreeMix, is available at http://treemix.googlecode.com.

Author Summary: With modern genotyping technology, it is now possible to obtain large amounts of genetic data from many populations in a species. An important question that can be addressed with these data is: what is the history of these populations? There is a long history in population genetics of inferring the relationships among populations as a bifurcating tree, analogous to phylogenetic trees for representing the evolution of species. However, it has long been recognized that, since populations from the same species exchange genes, simple bifurcating trees may be an incorrect representation of population histories. We have developed a method to address this issue, using a model which allows for both population splits and gene flow. In application to humans, we show that we are able to identify a number of both previously known and unknown episodes of gene flow in history, including gene flow into Cambodia of a population only distantly related to modern East Asia. In application to dogs, we show that the boxer and basenji breeds have a considerable component of ancestry from grey wolves subsequent to domestication.
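
As a rough illustration of the kind of summary a model built on genome-wide allele frequencies can work from, the sketch below computes a covariance matrix of allele frequencies between populations and expresses an admixed population's frequency as a weighted mixture of two ancestral frequencies. This is a minimal sketch under stated assumptions, not the TreeMix implementation: the function names `population_covariance` and `admixed_frequency`, the centering on the per-SNP mean across populations, and all toy frequencies are hypothetical choices made for this example.

```python
def population_covariance(freqs):
    """Covariance of allele frequencies between populations.

    freqs: list of SNPs, each a list of allele frequencies (one per population).
    Each SNP is centered on its mean across populations (an assumption of
    this sketch), then products are averaged over SNPs.
    """
    n_snps = len(freqs)
    n_pops = len(freqs[0])
    cov = [[0.0] * n_pops for _ in range(n_pops)]
    for snp in freqs:
        mean = sum(snp) / n_pops
        centered = [f - mean for f in snp]
        for i in range(n_pops):
            for j in range(n_pops):
                cov[i][j] += centered[i] * centered[j]
    return [[c / n_snps for c in row] for row in cov]


def admixed_frequency(weight, source_freq, parent_freq):
    """Frequency in a population that draws `weight` of its ancestry
    from a migration source and the rest from its tree parent."""
    return weight * source_freq + (1.0 - weight) * parent_freq


# Hypothetical toy data: 3 SNPs x 3 populations.
toy_freqs = [
    [0.1, 0.2, 0.9],
    [0.5, 0.4, 0.3],
    [0.8, 0.7, 0.1],
]
cov = population_covariance(toy_freqs)

# A 16% migration weight, as in the Cambodian example from the abstract,
# applied to made-up source and parent frequencies.
mixed = admixed_frequency(0.16, 0.5, 0.25)
```

In this toy data the first two populations have similar frequencies, so their covariance entry is positive while their entries with the third population are negative. Populations that covary more than a tree alone would predict are the sort of signal that a migration edge is meant to explain.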

Suggested Citation

  • Joseph K Pickrell & Jonathan K Pritchard, 2012. "Inference of Population Splits and Mixtures from Genome-Wide Allele Frequency Data," PLOS Genetics, Public Library of Science, vol. 8(11), pages 1-17, November.
  • Handle: RePEc:plo:pgen00:1002967
    DOI: 10.1371/journal.pgen.1002967

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1002967
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1002967&type=printable
    Download Restriction: no

