Printed from https://ideas.repec.org/a/eee/csdana/v179y2023ics0167947322002390.html

A sparse Bayesian hierarchical vector autoregressive model for microbial dynamics in a wastewater treatment plant

Author

Listed:
  • Hannaford, Naomi E.
  • Heaps, Sarah E.
  • Nye, Tom M.W.
  • Curtis, Thomas P.
  • Allen, Ben
  • Golightly, Andrew
  • Wilkinson, Darren J.

Abstract

Proper function of a wastewater treatment plant (WWTP) relies on maintaining a delicate balance between a multitude of competing microorganisms. Gaining a detailed understanding of the complex network of interactions therein is essential not only for maximising current operational efficiencies, but also for the effective design of new treatment technologies. Metagenomics offers an insight into these dynamic systems through the analysis of the microbial DNA sequences present. Unique taxa are deduced through sequence clustering to form operational taxonomic units (OTUs), with per-taxon abundance estimates obtained from corresponding sequence counts. The data in this study comprise weekly OTU counts from an activated sludge (AS) tank of a WWTP along with corresponding measurements of chemical and environmental (CE) covariates. Directly fitting a model to the OTU data is very challenging because of the high dimensionality and sparsity of the observations. The first step is therefore to aggregate the OTUs into twelve microbial communities or “bins” using a seasonal phase-based clustering approach. The mean abundances in the twelve bins are assumed to vary over time according to a multivariate linear regression on the CE covariates. Deviations from the mean are then modelled using a vector autoregressive (VAR) model of order one, which is a linear approximation to the commonly used generalised Lotka-Volterra (gLV) model. Sparsity in the interactions between microbial communities is encouraged by carrying out inference in a hierarchical Bayesian framework that places a shrinkage prior on the autoregressive coefficient matrix of the VAR model. Different shrinkage priors are explored by analysing simulated data sets before the regularised horseshoe prior is selected for the biological application. It is found that ammonia and chemical oxygen demand have a positive relationship with several bins and pH has a positive relationship with one bin. These results are supported by findings in the biological literature. Several negative interactions are also identified. These novel biological findings suggest OTUs in different bins may be competing for resources and that these relationships are complex. Although simpler than a gLV model, the VAR model is still able to offer valuable insight into the microbial dynamics of the WWTP.
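
The model structure described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: it draws a sparse autoregressive matrix from a regularised horseshoe prior in its commonly used parameterisation, builds a time-varying mean by linear regression on covariates, and adds VAR(1) deviations. The function names, the number of covariates, the values of tau, c and sigma, the Gaussian noise, and the rescaling step that forces stationarity are all assumptions made for illustration; the abstract does not state them.

```python
import numpy as np


def regularised_local_scale(lam, tau, c):
    """Regularised horseshoe local scale (assumed parameterisation).

    Returns sqrt(c^2 lam^2 / (c^2 + tau^2 lam^2)), so the prior standard
    deviation tau * lam_tilde of a coefficient never exceeds c.
    """
    return np.sqrt(c**2 * lam**2 / (c**2 + tau**2 * lam**2))


def draw_sparse_var_matrix(k, tau, c, rng, max_radius=0.9):
    """Draw a k x k autoregressive matrix with mostly near-zero entries."""
    lam = np.abs(rng.standard_cauchy(size=(k, k)))  # half-Cauchy local scales
    sd = tau * regularised_local_scale(lam, tau, c)
    A = sd * rng.normal(size=(k, k))
    # Illustration only (not part of the prior): shrink A if needed so that
    # its spectral radius is below 1 and the VAR(1) process is stationary.
    radius = np.max(np.abs(np.linalg.eigvals(A)))
    if radius > max_radius:
        A = A * (max_radius / radius)
    return A


def simulate_bins(X, B, A, sigma, rng):
    """Simulate bin abundances as regression mean plus VAR(1) deviations."""
    T, k = X.shape[0], B.shape[1]
    mean = X @ B                      # mean driven by the covariates
    dev = np.zeros((T, k))            # deviations from the mean
    for t in range(1, T):
        dev[t] = A @ dev[t - 1] + sigma * rng.normal(size=k)
    return mean + dev, mean


# Hypothetical sizes: 104 weeks, 12 bins, 3 covariates plus an intercept.
rng = np.random.default_rng(1)
T, k, p = 104, 12, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, p))])
B = rng.normal(scale=0.5, size=(p + 1, k))
A = draw_sparse_var_matrix(k, tau=0.05, c=1.0, rng=rng)
y, mean = simulate_bins(X, B, A, sigma=0.1, rng=rng)
print(y.shape)                        # (104, 12)
print(np.mean(np.abs(A) < 0.01))      # share of near-zero interactions
```

In this sketch a small global scale tau pulls most entries of A towards zero, while the heavy-tailed local scales let a few interactions remain large. A negative off-diagonal entry A[i, j] corresponds to bin j suppressing bin i at the next time step, which is the kind of interaction the abstract reports.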

Suggested Citation

  • Hannaford, Naomi E. & Heaps, Sarah E. & Nye, Tom M.W. & Curtis, Thomas P. & Allen, Ben & Golightly, Andrew & Wilkinson, Darren J., 2023. "A sparse Bayesian hierarchical vector autoregressive model for microbial dynamics in a wastewater treatment plant," Computational Statistics & Data Analysis, Elsevier, vol. 179(C).
  • Handle: RePEc:eee:csdana:v:179:y:2023:i:c:s0167947322002390
    DOI: 10.1016/j.csda.2022.107659

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947322002390
    Download Restriction: Full text for ScienceDirect subscribers only.


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Agudze, Komla M. & Billio, Monica & Casarin, Roberto & Ravazzolo, Francesco, 2022. "Markov switching panel with endogenous synchronization effects," Journal of Econometrics, Elsevier, vol. 230(2), pages 281-298.
    2. He, Yanying & Li, Yiming & Li, Xuecheng & Liu, Yingrui & Wang, Yufen & Guo, Haixiao & Hou, Jiaqi & Zhu, Tingting & Liu, Yiwen, 2023. "Net-zero greenhouse gas emission from wastewater treatment: Mechanisms, opportunities and perspectives," Renewable and Sustainable Energy Reviews, Elsevier, vol. 184(C).
    3. Qiong Wan & Qingji Han & Hailin Luo & Tao He & Feng Xue & Zihuizhong Ye & Chen Chen & Shan Huang, 2020. "Ceramsite Facilitated Microbial Degradation of Pollutants in Domestic Wastewater," IJERPH, MDPI, vol. 17(13), pages 1-13, June.
    4. Sharif Hossain & Christopher W. K. Chow & David Cook & Emma Sawade & Guna A. Hewa, 2022. "Review of Nitrification Monitoring and Control Strategies in Drinking Water System," IJERPH, MDPI, vol. 19(7), pages 1-31, March.
    5. Marina Riabiz & Wilson Ye Chen & Jon Cockayne & Pawel Swietach & Steven A. Niederer & Lester Mackey & Chris. J. Oates, 2022. "Optimal thinning of MCMC output," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(4), pages 1059-1081, September.
    6. Dellaportas, Petros & Titsias, Michalis K. & Petrova, Katerina & Plataniotis, Anastasios, 2023. "Scalable inference for a full multivariate stochastic volatility model," Journal of Econometrics, Elsevier, vol. 232(2), pages 501-520.
    7. Nkulu Rolly Kabange & Youngho Kwon & So-Myeong Lee & Ju-Won Kang & Jin-Kyung Cha & Hyeonjin Park & Gamenyah Daniel Dzorkpe & Dongjin Shin & Ki-Won Oh & Jong-Hee Lee, 2023. "Mitigating Greenhouse Gas Emissions from Crop Production and Management Practices, and Livestock: A Review," Sustainability, MDPI, vol. 15(22), pages 1-41, November.
    8. Boonstra, Philip S. & Barbaro, Ryan P. & Sen, Ananda, 2019. "Default priors for the intercept parameter in logistic regressions," Computational Statistics & Data Analysis, Elsevier, vol. 133(C), pages 245-256.
    9. Paul A. Parker & Scott H. Holan, 2023. "A Bayesian functional data model for surveys collected under informative sampling with application to mortality estimation using NHANES," Biometrics, The International Biometric Society, vol. 79(2), pages 1397-1408, June.
    10. Li, Jie & Shen, Xuzhu & Li, YaoTang, 2021. "Modeling the temporal dynamics of gut microbiota from a local community perspective," Ecological Modelling, Elsevier, vol. 460(C).
    11. Shengbo Gu & Leibin Liu & Xiaojie Zhuang & Jinsheng Qiu & Zhi Zhou, 2022. "Enhanced Nitrogen Removal in a Pilot-Scale Anoxic/Aerobic (A/O) Process Coupling PE Carrier and Nitrifying Bacteria PE Carrier: Performance and Microbial Shift," Sustainability, MDPI, vol. 14(12), pages 1-20, June.
    12. Kreuzer, Alexander & Dalla Valle, Luciana & Czado, Claudia, 2023. "Bayesian multivariate nonlinear state space copula models," Computational Statistics & Data Analysis, Elsevier, vol. 188(C).
    13. Friederike L. Pennemann & Assel Mussabekova & Christian Urban & Alexey Stukalov & Line Lykke Andersen & Vincent Grass & Teresa Maria Lavacca & Cathleen Holze & Lila Oubraham & Yasmine Benamrouche & En, 2021. "Cross-species analysis of viral nucleic acid interacting proteins identifies TAOKs as innate immune regulators," Nature Communications, Nature, vol. 12(1), pages 1-22, December.
    14. Damien McParland & Szymon Baron & Sarah O’Rourke & Denis Dowling & Eamonn Ahearne & Andrew Parnell, 2019. "Prediction of tool-wear in turning of medical grade cobalt chromium molybdenum alloy (ASTM F75) using non-parametric Bayesian models," Journal of Intelligent Manufacturing, Springer, vol. 30(3), pages 1259-1270, March.
    15. Filippo Pagani & Martin Wiegand & Saralees Nadarajah, 2022. "An n‐dimensional Rosenbrock distribution for Markov chain Monte Carlo testing," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 49(2), pages 657-680, June.
    16. Niloy Biswas & Anirban Bhattacharya & Pierre E. Jacob & James E. Johndrow, 2022. "Coupling‐based convergence assessment of some Gibbs samplers for high‐dimensional Bayesian regression with shrinkage priors," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(3), pages 973-996, July.
    17. Posch, Konstantin & Truden, Christian & Hungerländer, Philipp & Pilz, Jürgen, 2022. "A Bayesian approach for predicting food and beverage sales in staff canteens and restaurants," International Journal of Forecasting, Elsevier, vol. 38(1), pages 321-338.
    18. Anindya Bhadra & Jyotishka Datta & Yunfan Li & Nicholas Polson, 2020. "Horseshoe Regularisation for Machine Learning in Complex and Deep Models," International Statistical Review, International Statistical Institute, vol. 88(2), pages 302-320, August.
    19. Jair Andrade & Jim Duggan, 2021. "A Bayesian approach to calibrate system dynamics models using Hamiltonian Monte Carlo," System Dynamics Review, System Dynamics Society, vol. 37(4), pages 283-309, October.
    20. Gruber, Lutz F. & West, Mike, 2017. "Bayesian online variable selection and scalable multivariate volatility forecasting in simultaneous graphical dynamic linear models," Econometrics and Statistics, Elsevier, vol. 3(C), pages 3-22.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.