Printed from https://ideas.repec.org/a/plo/pcbi00/1011422.html

VIRify: An integrated detection, annotation and taxonomic classification pipeline using virus-specific protein profile hidden Markov models

Authors
  • Guillermo Rangel-Pineros
  • Alexandre Almeida
  • Martin Beracochea
  • Ekaterina Sakharova
  • Manja Marz
  • Alejandro Reyes Muñoz
  • Martin Hölzer
  • Robert D Finn

Abstract

The study of viral communities has revealed the enormous diversity and impact these biological entities have on various ecosystems. These observations have sparked widespread interest in developing computational strategies that support the comprehensive characterisation of viral communities based on sequencing data. Here we introduce VIRify, a new computational pipeline designed to provide a user-friendly and accurate functional and taxonomic characterisation of viral communities. VIRify identifies viral contigs and prophages from metagenomic assemblies and annotates them using a collection of viral profile hidden Markov models (HMMs). These include our manually-curated profile HMMs, which serve as specific taxonomic markers for a wide range of prokaryotic and eukaryotic viral taxa and are thus used to reliably classify viral contigs. We tested VIRify on assemblies from two microbial mock communities, a large metagenomics study, and a collection of publicly available viral genomic sequences from the human gut. The results showed that VIRify could identify sequences from both prokaryotic and eukaryotic viruses, and provided taxonomic classifications from the genus to the family rank with an average accuracy of 86.6%. In addition, VIRify allowed the detection and taxonomic classification of a range of prokaryotic and eukaryotic viruses present in 243 marine metagenomic assemblies. Finally, the use of VIRify led to a large expansion in the number of taxonomically classified human gut viral sequences and the improvement of outdated and shallow taxonomic classifications. Overall, we demonstrate that VIRify is a novel and powerful resource that offers an enhanced capability to detect a broad range of viral contigs and taxonomically classify them.

Author summary

Viruses are the most abundant biological entities on our planet. Some are relevant pathogens for public health or agriculture. Still, many also play ecological roles that are critical for maintaining ecosystems. Most viruses are yet to be cultured, so their identification and characterisation depend solely on the analysis of DNA or RNA obtained from the environment. Unlike cellular organisms, viruses also lack a universal genetic marker that allows taxonomic profiling of an environmental viral community. We have manually curated a set of specific viral protein models that serve as taxonomic markers for a comprehensive range of viral taxa. Using these protein models, we developed VIRify, a computational pipeline for the detection, annotation, and taxonomic classification of viral sequences obtained from environmental DNA or RNA. Our new pipeline was efficient in detecting and classifying sequences of viruses targeting bacteria or eukaryotic organisms in mock microbial communities, samples from the world’s oceans, and a previously assembled collection of human gut viruses. VIRify is user-friendly, requires minimal interaction with the command line, and was developed with portability in mind. VIRify can enhance the exploration of viral diversity in nature and support the detection of pathogenic viruses with pandemic potential.

Suggested Citation

  • Guillermo Rangel-Pineros & Alexandre Almeida & Martin Beracochea & Ekaterina Sakharova & Manja Marz & Alejandro Reyes Muñoz & Martin Hölzer & Robert D Finn, 2023. "VIRify: An integrated detection, annotation and taxonomic classification pipeline using virus-specific protein profile hidden Markov models," PLOS Computational Biology, Public Library of Science, vol. 19(8), pages 1-28, August.
  • Handle: RePEc:plo:pcbi00:1011422
    DOI: 10.1371/journal.pcbi.1011422

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011422
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1011422&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Li Zhang & Karen R. Jonscher & Zuyuan Zhang & Yi Xiong & Ryan S. Mueller & Jacob E. Friedman & Chongle Pan, 2022. "Islet autoantibody seroconversion in type-1 diabetes is associated with metagenome-assembled genomes in infant gut microbiomes," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    2. Jae-Chang Cho, 2021. "Human microbiome privacy risks associated with summary statistics," PLOS ONE, Public Library of Science, vol. 16(4), pages 1-11, April.
    3. Ying-Li Zhou & Paraskevi Mara & Guo-Jie Cui & Virginia P. Edgcomb & Yong Wang, 2022. "Microbiomes in the Challenger Deep slope and bottom-axis sediments," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    4. Candice R. Gurbatri & Georgette A. Radford & Laura Vrbanac & Jongwon Im & Elaine M. Thomas & Courtney Coker & Samuel R. Taylor & YoungUk Jang & Ayelet Sivan & Kyu Rhee & Anas A. Saleh & Tiffany Chien, 2024. "Engineering tumor-colonizing E. coli Nissle 1917 for detection and treatment of colorectal neoplasia," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    5. Chan Yeong Kim & Junyeong Ma & Insuk Lee, 2022. "HiFi metagenomic sequencing enables assembly of accurate and complete genomes from human gut microbiota," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    6. Jing Guo & Luyao Gong & Haiying Yu & Ming Li & Qiaohui An & Zhenquan Liu & Shuru Fan & Changjialian Yang & Dahe Zhao & Jing Han & Hua Xiang, 2024. "Engineered minimal type I CRISPR-Cas system for transcriptional activation and base editing in human cells," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    7. Caitlin Guccione & Lucas Patel & Yoshihiko Tomofuji & Daniel McDonald & Antonio Gonzalez & Gregory D. Sepich-Poore & Kyuto Sonehara & Mohsen Zakeri & Yang Chen & Amanda Hazel Dilmore & Neil Damle & Se, 2025. "Incomplete human reference genomes can drive false sex biases and expose patient-identifying information in metagenomic data," Nature Communications, Nature, vol. 16(1), pages 1-14, December.
    8. Shaojun Pan & Chengkai Zhu & Xing-Ming Zhao & Luis Pedro Coelho, 2022. "A deep siamese neural network improves metagenome-assembled genomes in microbiome datasets across different environments," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    9. Zhenmiao Zhang & Jin Xiao & Hongbo Wang & Chao Yang & Yufen Huang & Zhen Yue & Yang Chen & Lijuan Han & Kejing Yin & Aiping Lyu & Xiaodong Fang & Lu Zhang, 2024. "Exploring high-quality microbial genomes by assembling short-reads with long-range connectivity," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
    10. Chiranjib Chakraborty & Ashish Ranjan Sharma & Garima Sharma & Manojit Bhattacharya & Sang-Soo Lee, 2023. "Exploring the status of global terrestrial and aquatic microbial diversity through ‘Biodiversity Informatics’," Environment, Development and Sustainability: A Multidisciplinary Approach to the Theory and Practice of Sustainable Development, Springer, vol. 25(10), pages 10567-10598, October.
    11. Bin Ma & Caiyu Lu & Yiling Wang & Jingwen Yu & Kankan Zhao & Ran Xue & Hao Ren & Xiaofei Lv & Ronghui Pan & Jiabao Zhang & Yongguan Zhu & Jianming Xu, 2023. "A genomic catalogue of soil microbiomes boosts mining of biodiversity and genetic resources," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    12. Fiona B. Tamburini & Dylan Maghini & Ovokeraye H. Oduaran & Ryan Brewster & Michaella R. Hulley & Venesa Sahibdeen & Shane A. Norris & Stephen Tollman & Kathleen Kahn & Ryan G. Wagner & Alisha N. Wade, 2022. "Short- and long-read metagenomics of urban and rural South African gut microbiomes reveal a transitional composition and undescribed taxa," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    13. Can Chen & Chen Liao & Yang-Yu Liu, 2023. "Teasing out missing reactions in genome-scale metabolic networks through hypergraph learning," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    14. Eleonora Pedrazzoli & Michele Demozzi & Elisabetta Visentin & Matteo Ciciani & Ilaria Bonuzzi & Laura Pezzè & Lorenzo Lucchetta & Giulia Maule & Simone Amistadi & Federica Esposito & Mariangela Lupo &, 2024. "CoCas9 is a compact nuclease from the human microbiome for efficient and precise genome editing," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    15. Shuqin Zeng & Dhrati Patangia & Alexandre Almeida & Zhemin Zhou & Dezhi Mu & R. Paul Ross & Catherine Stanton & Shaopu Wang, 2022. "A compendium of 32,277 metagenome-assembled genomes and over 80 million genes from the early-life human gut microbiome," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    16. Mingyue Cheng & Shuai Luo & Peng Zhang & Guangzhou Xiong & Kai Chen & Chuanqi Jiang & Fangdian Yang & Hanhui Huang & Pengshuo Yang & Guanxi Liu & Yuhao Zhang & Sang Ba & Ping Yin & Jie Xiong & Wei Mia, 2024. "A genome and gene catalog of the aquatic microbiomes of the Tibetan Plateau," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    17. Sigal Leviatan & Saar Shoer & Daphna Rothschild & Maria Gorodetski & Eran Segal, 2022. "An expanded reference map of the human gut microbiome reveals hundreds of previously unknown species," Nature Communications, Nature, vol. 13(1), pages 1-14, December.

