
Estimating evolutionary and demographic parameters via ARG-derived IBD

Author

Listed:
  • Zhendong Huang
  • Jerome Kelleher
  • Yao-ban Chan
  • David Balding

Abstract

Inference of evolutionary and demographic parameters from a sample of genome sequences often proceeds by first inferring identical-by-descent (IBD) genome segments. By exploiting efficient data encoding based on the ancestral recombination graph (ARG), we obtain three major advantages over current approaches: (i) no need to impose a length threshold on IBD segments, (ii) IBD can be defined without the hard-to-verify requirement of no recombination, and (iii) computation time can be reduced with little loss of statistical efficiency using only the IBD segments from a set of sequence pairs that scales linearly with sample size. We first demonstrate powerful inferences when true IBD information is available from simulated data. For IBD inferred from real data, we propose an approximate Bayesian computation inference algorithm and use it to show that even poorly-inferred short IBD segments can improve estimation. Our mutation-rate estimator achieves precision similar to a previously-published method despite a 4 000-fold reduction in data used for inference, and we identify significant differences between human populations. Computational cost limits model complexity in our approach, but we are able to incorporate unknown nuisance parameters and model misspecification, still finding improved parameter inference.

Author summary: Samples of genome sequences can be informative about the history of the population from which they were drawn, and about mutation and other processes that led to the observed sequences. However, obtaining reliable inferences is challenging, because of the complexity of the underlying processes and the large amounts of sequence data that are often now available. A common approach to simplifying the data is to use only genome segments that are very similar between two sequences, called identical-by-descent (IBD). The longer the IBD segment the more informative it is about recent shared ancestry, and current approaches restrict attention to IBD segments above a length threshold. We instead are able to use IBD segments of any length, allowing us to extract much more information from the sequence data. To reduce the computational burden we identify subsets of the available sequence pairs that lead to little information loss. Our approach exploits recent advances in inferring the genealogical history underlying the sample of sequences. Computational cost still limits the size and complexity of problems our method can handle, but where feasible we obtain dramatic improvements in the power of inferences.
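
The abstract refers to an approximate Bayesian computation (ABC) inference algorithm without describing it here. As a rough illustration of the general idea only, the following is a minimal sketch of generic ABC rejection sampling; it is not the authors' algorithm. It assumes a toy model in which segment lengths are exponentially distributed with an unknown rate, a uniform prior on that rate, and the mean length as the only summary statistic. The names `simulate_lengths` and `abc_rejection`, the prior bounds and the tolerance are all hypothetical choices made for this example.

```python
import random
import statistics


def simulate_lengths(rate, n, rng):
    """Toy simulator (an assumption): n segment lengths ~ Exponential(rate)."""
    return [rng.expovariate(rate) for _ in range(n)]


def abc_rejection(observed, prior_low, prior_high, n_draws, tolerance, seed=0):
    """Generic ABC rejection sampling.

    Draw a candidate rate from a uniform prior, simulate data under it,
    and keep the candidate if the simulated mean length is within
    `tolerance` of the observed mean length.
    """
    rng = random.Random(seed)
    obs_summary = statistics.fmean(observed)
    accepted = []
    for _ in range(n_draws):
        rate = rng.uniform(prior_low, prior_high)  # prior draw
        sim = simulate_lengths(rate, len(observed), rng)
        if abs(statistics.fmean(sim) - obs_summary) <= tolerance:
            accepted.append(rate)  # approximate posterior sample
    return accepted


if __name__ == "__main__":
    # Pretend data generated with a known rate, so the result can be sanity-checked.
    observed = simulate_lengths(2.0, 200, random.Random(42))
    posterior = abc_rejection(observed, 0.5, 5.0, 5000, 0.02)
    print(len(posterior), statistics.fmean(posterior))
```

The accepted values form an approximate posterior sample for the rate. A smaller tolerance gives a closer approximation at the cost of fewer accepted draws, which is one reason computational cost can limit model complexity in ABC-style approaches.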

Suggested Citation

  • Zhendong Huang & Jerome Kelleher & Yao-ban Chan & David Balding, 2025. "Estimating evolutionary and demographic parameters via ARG-derived IBD," PLOS Genetics, Public Library of Science, vol. 21(1), pages 1-16, January.
  • Handle: RePEc:plo:pgen00:1011537
    DOI: 10.1371/journal.pgen.1011537

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1011537
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1011537&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Nicola F. Müller & Kathryn E. Kistler & Trevor Bedford, 2022. "A Bayesian approach to infer recombination patterns in coronaviruses," Nature Communications, Nature, vol. 13(1), pages 1-9, December.
    2. Romain Fournier & Zoi Tsangalidou & David Reich & Pier Francesco Palamara, 2023. "Haplotype-based inference of recent effective population size in modern and ancient DNA samples," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    3. Leonardo Vallini & Carlo Zampieri & Mohamed Javad Shoaee & Eugenio Bortolini & Giulia Marciani & Serena Aneli & Telmo Pievani & Stefano Benazzi & Alberto Barausse & Massimo Mezzavilla & Michael D. Pet, 2024. "The Persian plateau served as hub for Homo sapiens after the main out of Africa dispersal," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    4. repec:plo:pgen00:1008895 is not listed on IDEAS
    5. Legried, Brandon & Terhorst, Jonathan, 2022. "Rates of convergence in the two-island and isolation-with-migration models," Theoretical Population Biology, Elsevier, vol. 147(C), pages 16-27.
    6. Ali Mahmoudi & Jere Koskela & Jerome Kelleher & Yao-ban Chan & David Balding, 2022. "Bayesian inference of ancestral recombination graphs," PLOS Computational Biology, Public Library of Science, vol. 18(3), pages 1-15, March.
    7. Hayman, Elizabeth & Ignatieva, Anastasia & Hein, Jotun, 2023. "Recoverability of ancestral recombination graph topologies," Theoretical Population Biology, Elsevier, vol. 154(C), pages 27-39.
    8. Philippe Gambette & Leo van Iersel & Mark Jones & Manuel Lafond & Fabio Pardi & Celine Scornavacca, 2017. "Rearrangement moves on rooted phylogenetic networks," PLOS Computational Biology, Public Library of Science, vol. 13(8), pages 1-21, August.
    9. Jerome Kelleher & Alison M Etheridge & Gilean McVean, 2016. "Efficient Coalescent Simulation and Genealogical Analysis for Large Sample Sizes," PLOS Computational Biology, Public Library of Science, vol. 12(5), pages 1-22, May.
    10. Deng, Yun & Song, Yun S. & Nielsen, Rasmus, 2021. "The distribution of waiting distances in ancestral recombination graphs," Theoretical Population Biology, Elsevier, vol. 141(C), pages 34-43.
    11. repec:plo:pgen00:1008624 is not listed on IDEAS
