Author
Listed:
- Daniel Segura
- Divya Sharma
- Osvaldo Espin-Garcia
Abstract
The microbiome is increasingly regarded as a key component of human health, and analysis of microbiome data can aid in the development of precision medicine. Due to the high cost of shotgun metagenomic sequencing (SM-seq), microbiome analyses can be done cost-effectively in two phases: Phase 1, sequencing of 16S ribosomal RNA, and Phase 2, SM-seq of an informative subsample. Existing research suggests strategies to select the subsample based on biological diversity and dissimilarity metrics calculated using operational taxonomic units (OTUs). However, the microbiome field has progressed towards amplicon sequence variants (ASVs), as they provide more precise microbe identification and sample diversity information. The aim of this work is to compare the subsampling strategies for two-phase metagenomic studies when using ASVs instead of OTUs, and to propose data-driven strategies for subsample selection through dimension reduction techniques. We used 199 samples of infant-gut microbiome data from the DIABIMMUNE project to generate ASVs and OTUs, then generated subsamples based on five existing biologically driven subsampling methods and two data-driven methods. Linear discriminant analysis Effect Size (LEfSe) was used to assess differential representation of taxa between the subsamples and the overall sample. Across the subsampling methods evaluated, subsample selection using ASVs showed 50-93% agreement with selection using OTUs, and showed a similar bacterial representation across all methods. Although sampling using ASVs and OTUs typically leads to similar results for each subsample, ASVs had more clades that differed in expression levels between allergic and non-allergic individuals across all sample sizes compared to OTUs, and led to more biomarkers discovered at the Phase 2 SM-seq level.
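The idea of comparing ASV-based and OTU-based subsample selection can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the five biologically driven methods, so the sketch assumes one plausible diversity-based rule (keep the k samples with the highest Shannon index) and assumes that agreement means the fraction of selected samples shared by the two selections. The function names, the toy count tables, and the value of k are all hypothetical.

```python
import math


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def select_top_k(table, k):
    """Return the IDs of the k samples with the highest Shannon diversity.

    `table` maps a sample ID to its list of per-taxon read counts.
    Ties are broken by sample ID so the result is deterministic.
    """
    ranked = sorted(table, key=lambda s: (-shannon_diversity(table[s]), s))
    return set(ranked[:k])


def agreement(selected_a, selected_b):
    """Fraction of selected samples shared by two equal-sized selections."""
    return len(selected_a & selected_b) / len(selected_a)


# Toy Phase 1 count tables (hypothetical data, not from DIABIMMUNE).
# The same four samples are described at ASV and at OTU resolution.
asv_table = {
    "s1": [10, 10, 10, 10],
    "s2": [37, 1, 1, 1],
    "s3": [20, 15, 5, 0],
    "s4": [25, 5, 5, 5],
}
otu_table = {
    "s1": [20, 20],
    "s2": [38, 2],
    "s3": [22, 18],
    "s4": [30, 10],
}

asv_pick = select_top_k(asv_table, k=2)
otu_pick = select_top_k(otu_table, k=2)
print("ASV-based subsample:", sorted(asv_pick))
print("OTU-based subsample:", sorted(otu_pick))
print("Agreement:", agreement(asv_pick, otu_pick))
```

In this toy case the two resolutions rank the samples differently, so the Phase 2 subsamples only partly overlap. A real analysis would typically work from a feature table produced by a 16S processing pipeline and might use dissimilarity-based or dimension-reduction-based selection instead of this single ranking rule.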
Suggested Citation
Daniel Segura & Divya Sharma & Osvaldo Espin-Garcia, 2024.
"Comparing subsampling strategies for metagenomic analysis in microbial studies using amplicon sequence variants versus operational taxonomic units,"
PLOS ONE, Public Library of Science, vol. 19(12), pages 1-19, December.
Handle:
RePEc:plo:pone00:0315720
DOI: 10.1371/journal.pone.0315720