Printed from https://ideas.repec.org/a/plo/pcbi00/1010467.html

Investigating differential abundance methods in microbiome data: A benchmark study

Author

Listed:
  • Marco Cappellato
  • Giacomo Baruzzo
  • Barbara Di Camillo

Abstract

The development of increasingly efficient and cost-effective high-throughput DNA sequencing techniques has made it easier to study complex microbial systems. Recently, researchers have shown great interest in studying the microorganisms that characterise different ecological niches. Differential abundance analysis aims to find differences in the abundance of each taxon between two classes of subjects or samples, assigning a significance value to each comparison. Several bioinformatic methods have been developed specifically for this task, taking into account the challenges of microbiome data, such as sparsity, differences in sequencing depth between samples, and compositionality. Differential abundance analysis has led to important conclusions in different fields, from health to the environment. However, the lack of a known biological truth makes it difficult to validate the results obtained. In this work we exploit metaSPARSim, a microbial sequencing count data simulator, to simulate data with differentially abundant features between experimental groups. We perform a complete comparison of recently developed and established methods on a common benchmark, paying particular attention to the reliability of both the simulated scenarios and the evaluation metrics. The performance overview covers numerous scenarios, studying the effect on the methods' results of the main covariates, such as sample size, percentage of differentially abundant features, sequencing depth, feature variability, normalisation approach and ecological niche. Mainly, we find that methods show good control of the type I error and, generally, also of the false discovery rate at high sample size, while recall seems to depend on the dataset and sample size.

Author summary: The microbiota is the set of microorganisms that characterise an ecological environment or niche. Several studies have shown that the microbiota is involved in various biological mechanisms that affect the health or balance of the host organism or the ecosystem. New discoveries and insights have been possible thanks to increasingly efficient sequencing technologies together with the development of bioinformatic computational methods. One of the most interesting analyses in this landscape is the identification of microorganisms that show significantly different abundances when two groups of subjects are analysed. Although many computational methods have been developed, it is still unclear which one performs best. Therefore, we exploited a simulator of microbiome data to build a simulation framework that allowed us to carry out an extensive benchmarking of the known tools for differential abundance analysis. Our work is not only a starting point to guide analysts in the choice of tools, but also a first step towards a robust, reliable and fair simulation framework.
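
The three evaluation metrics named in the abstract (type I error, false discovery rate and recall) can be computed once a simulator supplies the ground truth for each feature. The sketch below is only an illustration using the textbook definitions, not the authors' benchmark code: the function names, the 0.05 threshold, the choice of Benjamini-Hochberg adjustment and the toy p-values are all assumptions made for this example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def evaluate_calls(p_values, is_da, alpha=0.05):
    """Score one method on one simulated dataset (hypothetical helper).

    p_values: raw per-feature p-values reported by the method.
    is_da:    ground truth from the simulation (True = differentially abundant).
    """
    adjusted = benjamini_hochberg(p_values)
    called = [q <= alpha for q in adjusted]
    tp = sum(c and t for c, t in zip(called, is_da))
    fp = sum(c and not t for c, t in zip(called, is_da))
    n_da = sum(is_da)
    n_null = len(is_da) - n_da
    # Type I error uses raw p-values on the truly null features only.
    raw_fp = sum(p <= alpha and not t for p, t in zip(p_values, is_da))
    return {
        "recall": tp / n_da if n_da else float("nan"),
        "fdr": fp / (tp + fp) if (tp + fp) else 0.0,
        "type_I_error": raw_fp / n_null if n_null else float("nan"),
    }


# Toy example: six features, the first two simulated as differentially abundant.
toy_p = [0.001, 0.002, 0.04, 0.20, 0.50, 0.80]
toy_truth = [True, True, False, False, False, False]
result = evaluate_calls(toy_p, toy_truth)
print(result)
```

In a real benchmark these quantities would be averaged over many simulated datasets per scenario, since a single dataset gives a noisy estimate.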

Suggested Citation

  • Marco Cappellato & Giacomo Baruzzo & Barbara Di Camillo, 2022. "Investigating differential abundance methods in microbiome data: A benchmark study," PLOS Computational Biology, Public Library of Science, vol. 18(9), pages 1-32, September.
  • Handle: RePEc:plo:pcbi00:1010467
    DOI: 10.1371/journal.pcbi.1010467

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1010467
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1010467&type=printable
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Koji Hosomi & Mayu Saito & Jonguk Park & Haruka Murakami & Naoko Shibata & Masahiro Ando & Takahiro Nagatake & Kana Konishi & Harumi Ohno & Kumpei Tanisawa & Attayeb Mohsen & Yi-An Chen & Hitoshi Kawa, 2022. "Oral administration of Blautia wexlerae ameliorates obesity and type 2 diabetes via metabolic remodeling of the gut microbiota," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    2. Gabriel Lachance & Karine Robitaille & Jalal Laaraj & Nikunj Gevariya & Thibault V. Varin & Andrei Feldiorean & Fanny Gaignier & Isabelle Bourdeau Julien & Hui Wen Xu & Tarek Hallal & Jean-François Pe, 2024. "The gut microbiome-prostate cancer crosstalk is modulated by dietary polyunsaturated long-chain fatty acids," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
