Printed from https://ideas.repec.org/a/spr/jagbes/v26y2021i2d10.1007_s13253-021-00447-1.html

A Statistical Perspective on the Challenges in Molecular Microbial Biology

Authors

  • Pratheepa Jeganathan (Stanford University)
  • Susan P. Holmes (Stanford University)

Abstract

High throughput sequencing (HTS)-based technology enables identifying and quantifying non-culturable microbial organisms in all environments. Microbial sequences have enhanced our understanding of the human microbiome, the soil and plant environment, and the marine environment. All molecular microbial data pose statistical challenges due to contamination sequences from reagents, batch effects, unequal sampling, and undetected taxa. Technical biases and heteroscedasticity have the strongest effects, but different strains across subjects and environments also make direct differential abundance testing unwieldy. We provide an introduction to a few statistical tools that can overcome some of these difficulties and demonstrate those tools on an example. We show how standard statistical methods, such as simple hierarchical mixture and topic models, can facilitate inferences on latent microbial communities. We also review some nonparametric Bayesian approaches that combine visualization and uncertainty quantification. The intersection of molecular microbial biology and statistics is an exciting new venue. Finally, we list some of the important open problems that would benefit from more careful statistical method development.

Suggested Citation

  • Pratheepa Jeganathan & Susan P. Holmes, 2021. "A Statistical Perspective on the Challenges in Molecular Microbial Biology," Journal of Agricultural, Biological and Environmental Statistics, Springer; The International Biometric Society; American Statistical Association, vol. 26(2), pages 131-160, June.
  • Handle: RePEc:spr:jagbes:v:26:y:2021:i:2:d:10.1007_s13253-021-00447-1
    DOI: 10.1007/s13253-021-00447-1


    Citations


    Cited by:

    1. Patrick LeBlanc & Li Ma, 2023. "Microbiome subcommunity learning with logistic‐tree normal latent Dirichlet allocation," Biometrics, The International Biometric Society, vol. 79(3), pages 2321-2332, September.
    2. Can Cui & Susheela P. Singh & Ana‐Maria Staicu & Brian J. Reich, 2021. "Bayesian variable selection for high‐dimensional rank data," Environmetrics, John Wiley & Sons, Ltd., vol. 32(7), November.

