Authors
- Ian Roberts
- Richard G Everitt
- Jere Koskela
- Xavier Didelot
Abstract
Over the past decade, pathogen genome sequencing has become well established as a powerful approach to study infectious disease epidemiology. In particular, when multiple genomes are available from several geographical locations, comparing them is informative about the relative size of the local pathogen populations as well as past migration rates and events between locations. The structured coalescent model has a long history of being used as the underlying process for such phylogeographic analysis. However, the computational cost of using this model does not scale well to the large number of genomes frequently analysed in pathogen genomic epidemiology studies. Several approximations of the structured coalescent model have been proposed, but their effects are difficult to predict. Here we show how the exact structured coalescent model can be used to analyse a precomputed dated phylogeny, in order to perform Bayesian inference on the past migration history, the effective population sizes in each location, and the directed migration rates from any location to another. We describe an efficient reversible jump Markov chain Monte Carlo scheme, which is implemented in a new R package, StructCoalescent. We use simulations to demonstrate the scalability and correctness of our method and to compare it with existing software. We also apply our new method to several state-of-the-art datasets on the population structure of real pathogens to showcase the relevance of our method to current data scales and research questions.
Author summary
A virus may be present in several countries, but typically most transmission events will take place within each country, with only a relatively small number of transmission events happening from one country to another. Such structure in the pathogen population has an effect on the similarity between genomes. If the geographical structure is strong, then genomes collected from the same location will be more similar on average than genomes collected from different locations. Conversely, we can reverse this principle to determine what the relationships between genomes (that we observe) imply about the pathogen population structure (that we do not observe but want to learn about). Here we present a new method to perform this task. We apply it to several simulated and real sets of pathogen genomes to reveal their underlying population structure. Knowing about pathogen population structures has important consequences for understanding the evolution and epidemiology of infectious disease pathogens, and therefore for informing the public health policies that can limit their burden.
Suggested Citation
Ian Roberts & Richard G Everitt & Jere Koskela & Xavier Didelot, 2025.
"Bayesian Inference of Pathogen Phylogeography using the Structured Coalescent Model,"
PLOS Computational Biology, Public Library of Science, vol. 21(4), pages 1-37, April.
Handle: RePEc:plo:pcbi00:1012995
DOI: 10.1371/journal.pcbi.1012995