
Unsupervised detection and fitness estimation of emerging SARS-CoV-2 variants: Application to wastewater samples (ANRS0160)

Author

Listed:
  • Alexandra Lefebvre
  • Vincent Maréchal
  • Arnaud Gloaguen
  • The Obépine Consortium
  • Amaury Lambert
  • Yvon Maday

Abstract

Repeated waves of emerging variants during the SARS-CoV-2 pandemic have highlighted the need to collect longitudinal genomic data and to develop statistical methods based on time series analysis for detecting new threatening lineages and estimating their fitness early. Most models study the evolution of the prevalence of particular lineages over time and require a prior classification of sequences into lineages, which is prone to induce delays and biases. More recently, several authors have studied the evolution of the prevalence of mutations over time with alternative clustering approaches, avoiding specific lineage classification. Most existing methods are either non-parametric or unsuited to pooled data such as wastewater samples. The analysis of wastewater samples has recently been recognized as a valuable complement to clinical sample analysis; however, the pooled nature of the data raises specific statistical challenges. In this context, we propose an alternative unsupervised method for clustering mutations according to their frequency trajectory over time and estimating group fitness from time series of pooled mutation prevalence data. Our model is a mixture model with observed count data and latent group assignments, and we use the expectation-maximization algorithm for model selection and parameter estimation. Applied to time series of SARS-CoV-2 sequencing data collected from wastewater treatment plants in France from October 2020 to April 2021, our method agnostically groups mutations in a way consistent with lineages B.1.160, Alpha, B.1.177, and Beta, with selection coefficient estimates per group consistent with the viral dynamics in France reported by Nextstrain. Moreover, our method detected the Alpha variant as threatening as early as supervised methods (which track specific mutations over time), with the notable difference that, being unsupervised, it does not require any prior information on the set of mutations.

Author summary: The SARS-CoV-2 pandemic has been characterized by successive waves of emerging variants replacing previously dominant ones. A variant is characterized by a combination of mutations, some of which may be shared among related variants. The early detection of emerging variants is of great importance for adapting public health responses to viral evolution. Wastewater surveillance has been highlighted as a valuable complement to clinical sample analysis, mostly because it is representative of viral circulation at the population level. Indeed, all infected individuals, whether symptomatic or not, contribute to wastewater samples. Wastewater surveillance is, however, subject to statistical challenges, as the viral genetic material is highly fragmented, incomplete, and comes from multiple individuals. In this work we propose a method, suited to wastewater samples, that groups viral mutations according to their frequency trajectory through time in an agnostic manner, and we detect threatening variants, without prior knowledge of their characteristic mutations, as early as methods targeting known specific mutations.
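To make the idea of clustering mutations by frequency trajectory more concrete, here is a minimal sketch of an expectation-maximization fit for a mixture of binomial counts. It is not the authors' implementation: the abstract does not give the model's exact form, so the sketch assumes that each group follows a logistic trajectory, logit f_k(t) = a_k + s_k t, with the slope s_k standing in for the group's selection coefficient. The function name `fit_groups`, the crude trend-based initialization, the fixed number of groups, and the simulated data are all illustrative assumptions. Model selection (choosing the number of groups) is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fit_groups(counts, depths, times, n_groups, n_iter=30):
    """EM for a binomial mixture with one logistic trajectory per group.

    counts, depths: (n_mutations, n_times) mutant reads and total reads.
    Returns hard group labels, per-group slopes, and mixing weights.
    """
    counts = np.asarray(counts, dtype=float)
    depths = np.asarray(depths, dtype=float)
    times = np.asarray(times, dtype=float)
    n_mut, n_t = counts.shape
    X = np.column_stack([np.ones(n_t), times])  # design matrix (intercept, time)

    # Assumed initialization: rank mutations by late-minus-early frequency
    # and split the ranking into n_groups chunks.
    freq = counts / np.maximum(depths, 1.0)
    half = n_t // 2
    trend = freq[:, half:].mean(axis=1) - freq[:, :half].mean(axis=1)
    resp = np.zeros((n_mut, n_groups))
    for k, idx in enumerate(np.array_split(np.argsort(trend), n_groups)):
        resp[idx, k] = 1.0

    beta = np.zeros((n_groups, 2))  # (intercept a_k, slope s_k) per group
    for _ in range(n_iter):
        # M-step: mixing weights, then a weighted logistic fit per group
        # using a few Newton steps on the pooled counts.
        weights = np.clip(resp.mean(axis=0), 1e-12, None)
        for k in range(n_groups):
            y = resp[:, k] @ counts   # responsibility-weighted mutant reads
            n = resp[:, k] @ depths   # responsibility-weighted total reads
            for _ in range(10):
                p = sigmoid(X @ beta[k])
                grad = X.T @ (y - n * p)
                hess = (X * (n * p * (1.0 - p))[:, None]).T @ X + 1e-6 * np.eye(2)
                beta[k] += np.linalg.solve(hess, grad)

        # E-step: posterior probability of each group for each mutation
        # (the binomial coefficient is constant across groups and dropped).
        p = np.clip(sigmoid(beta @ X.T), 1e-9, 1.0 - 1e-9)  # (n_groups, n_times)
        loglik = counts @ np.log(p).T + (depths - counts) @ np.log1p(-p).T
        logpost = loglik + np.log(weights)
        logpost -= logpost.max(axis=1, keepdims=True)
        resp = np.exp(logpost)
        resp /= resp.sum(axis=1, keepdims=True)

    return resp.argmax(axis=1), beta[:, 1], weights

# Simulated example (hypothetical data): 20 rising and 10 declining mutations.
rng = np.random.default_rng(1)
times = np.arange(10.0)
true_logit = np.vstack([np.tile(-2.0 + 0.4 * times, (20, 1)),
                        np.tile(1.0 - 0.3 * times, (10, 1))])
depths = np.full(true_logit.shape, 300)
counts = rng.binomial(depths, sigmoid(true_logit))
labels, slopes, weights = fit_groups(counts, depths, times, n_groups=2)
```

In this toy setting, the slope recovered for each group plays the role of a per-group fitness estimate, and a group with a clearly positive slope would be the candidate "threatening" one. Real wastewater data would additionally require handling missing positions, uneven sequencing depth, and a principled choice of the number of groups.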

Suggested Citation

  • Alexandra Lefebvre & Vincent Maréchal & Arnaud Gloaguen & The Obépine Consortium & Amaury Lambert & Yvon Maday, 2025. "Unsupervised detection and fitness estimation of emerging SARS-CoV-2 variants: Application to wastewater samples (ANRS0160)," PLOS Computational Biology, Public Library of Science, vol. 21(12), pages 1-22, December.
  • Handle: RePEc:plo:pcbi00:1013749
    DOI: 10.1371/journal.pcbi.1013749

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013749
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1013749&type=printable
    Download Restriction: no
