DeepBiome: A Phylogenetic Tree Informed Deep Neural Network for Microbiome Data Analysis

Authors

  • Jing Zhai (University of Arizona)
  • Youngwon Choi (Seoul National University; UCLA Center for Vision & Imaging Biomarkers)
  • Xingyi Yang (University of Arizona)
  • Yin Chen (University of Arizona)
  • Kenneth Knox (University of Arizona)
  • Homer L. Twigg (Indiana University Medical Center)
  • Joong-Ho Won (Seoul National University)
  • Hua Zhou (University of California)
  • Jin J. Zhou (University of California; University of California; University of California, Los Angeles)

Abstract

Evidence linking the microbiome to human health is growing rapidly, and the microbiome profile has potential as a novel predictive biomarker for many diseases. However, tables of bacterial counts are typically sparse, and bacteria are classified within a hierarchy of taxonomic levels, ranging from species to phylum. Existing tools focus on identifying microbiome associations at either the community level or a specific, pre-defined taxonomic level. Incorporating the evolutionary relationships among bacteria can enhance data interpretation, because it allows microbiome contributions to be aggregated, leading to more accurate and interpretable results. We present DeepBiome, a phylogeny-informed neural network architecture that predicts phenotypes from microbiome counts and uncovers the microbiome–phenotype association network. It takes microbiome abundance as input and uses phylogenetic taxonomy to guide the neural network’s architecture. By leveraging phylogenetic information, DeepBiome reduces the need for extensive tuning of the deep learning architecture, mitigates overfitting, and, crucially, enables visualization of the path from microbiome counts to disease. It is applicable to both regression and classification problems. Simulation studies and real data analyses show that DeepBiome is both accurate and efficient, and that it offers insight into complex microbiome–phenotype associations even with small to moderate training sample sizes. In practice, the specific taxonomic level at which microbiome clusters tag the association is unknown. The main advantage of the presented method over other analytical methods is therefore that it offers an ecological and evolutionary understanding of host–microbe interactions, which is important for microbiome-based medicine. DeepBiome is implemented in Python using the Keras and TensorFlow packages. It is an open-source tool available at https://github.com/Young-won/DeepBiome .
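
To make the idea of a taxonomy-guided architecture concrete, the following is a minimal sketch and not DeepBiome's actual implementation. It assumes one simple way a tree could guide a network: each hidden layer corresponds to a taxonomic level, and a weight is kept only where the taxonomy has a parent-child edge. The toy taxonomy, the function names (build_mask, masked_layer, forward), and the random weights are all hypothetical and chosen for illustration; the abstract does not state how DeepBiome uses the tree internally.

```python
import math
import random

# Hypothetical toy taxonomy (not taken from the paper): genus -> family -> phylum.
GENUS_TO_FAMILY = {
    "Streptococcus": "Streptococcaceae",
    "Lactococcus": "Streptococcaceae",
    "Veillonella": "Veillonellaceae",
    "Prevotella": "Prevotellaceae",
}
FAMILY_TO_PHYLUM = {
    "Streptococcaceae": "Firmicutes",
    "Veillonellaceae": "Firmicutes",
    "Prevotellaceae": "Bacteroidetes",
}


def build_mask(child_to_parent):
    """Return (children, parents, mask); mask[i][j] is 1.0 iff parent j owns child i."""
    children = sorted(child_to_parent)
    parents = sorted(set(child_to_parent.values()))
    mask = [[1.0 if child_to_parent[c] == p else 0.0 for p in parents]
            for c in children]
    return children, parents, mask


def masked_layer(x, weights, mask, activation=math.tanh):
    """Dense layer whose weights are zeroed wherever the tree has no edge."""
    n_out = len(mask[0])
    return [
        activation(sum(x[i] * weights[i][j] * mask[i][j] for i in range(len(x))))
        for j in range(n_out)
    ]


def random_weights(n_in, n_out, rng):
    """Untrained placeholder weights; a real model would learn these."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_out)] for _ in range(n_in)]


def forward(counts, seed=0):
    """Map genus-level counts to family and phylum activations and a scalar score."""
    rng = random.Random(seed)
    genera, families, m1 = build_mask(GENUS_TO_FAMILY)
    _, phyla, m2 = build_mask(FAMILY_TO_PHYLUM)

    # Convert raw counts to relative abundances (assumed preprocessing step).
    total = sum(counts.values())
    x = [counts.get(g, 0) / total if total else 0.0 for g in genera]

    w1 = random_weights(len(genera), len(families), rng)
    w2 = random_weights(len(families), len(phyla), rng)
    w_out = [rng.uniform(-1.0, 1.0) for _ in phyla]

    h_family = masked_layer(x, w1, m1)
    h_phylum = masked_layer(h_family, w2, m2)
    score = sum(h * w for h, w in zip(h_phylum, w_out))
    return {
        "family": dict(zip(families, h_family)),
        "phylum": dict(zip(phyla, h_phylum)),
        "score": score,
    }


if __name__ == "__main__":
    result = forward({"Prevotella": 10})
    print(result["family"])
    print(result["phylum"])
```

Because every hidden unit is tied to a named taxon, a sample containing only Prevotella leaves the Streptococcaceae, Veillonellaceae, and Firmicutes units at zero, which is the kind of traceable path from counts to outcome the abstract describes. In Keras, a similar hard mask could presumably be expressed through a custom kernel constraint on a Dense layer, though that is an assumption about one possible design rather than a description of the published tool.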

Suggested Citation

  • Jing Zhai & Youngwon Choi & Xingyi Yang & Yin Chen & Kenneth Knox & Homer L. Twigg & Joong-Ho Won & Hua Zhou & Jin J. Zhou, 2025. "DeepBiome: A Phylogenetic Tree Informed Deep Neural Network for Microbiome Data Analysis," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 17(1), pages 191-215, April.
  • Handle: RePEc:spr:stabio:v:17:y:2025:i:1:d:10.1007_s12561-024-09434-9
    DOI: 10.1007/s12561-024-09434-9

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s12561-024-09434-9
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



