Author
Listed:
- Wolfgang Kopp
(Berlin Institute for Medical Systems Biology, Max Delbrueck Center for Molecular Medicine)
- Remo Monti
(Berlin Institute for Medical Systems Biology, Max Delbrueck Center for Molecular Medicine; Digital Health Machine Learning, Hasso Plattner Institute, University of Potsdam)
- Annalaura Tamburrini
(Berlin Institute for Medical Systems Biology, Max Delbrueck Center for Molecular Medicine; University of Rome ‘Tor Vergata’)
- Uwe Ohler
(Berlin Institute for Medical Systems Biology, Max Delbrueck Center for Molecular Medicine; Humboldt University)
- Altuna Akalin
(Berlin Institute for Medical Systems Biology, Max Delbrueck Center for Molecular Medicine)
Abstract
In recent years, numerous applications have demonstrated the potential of deep learning for an improved understanding of biological processes. However, most deep learning tools developed so far are designed to address a specific question on a fixed dataset and/or by a fixed model architecture. Here we present Janggu, a python library that facilitates deep learning for genomics applications, aiming to ease data acquisition and model evaluation. Among its key features are special dataset objects, which form a unified and flexible data acquisition and pre-processing framework for genomics data that enables streamlining of future research applications through reusable components. Through a numpy-like interface, these dataset objects are directly compatible with popular deep learning libraries, including keras or pytorch. Janggu offers the possibility to visualize predictions as genomic tracks or by exporting them to the bigWig format as well as utilities for keras-based models. We illustrate the functionality of Janggu on several deep learning genomics applications. First, we evaluate different model topologies for the task of predicting binding sites for the transcription factor JunD. Second, we demonstrate the framework on published models for predicting chromatin effects. Third, we show that promoter usage measured by CAGE can be predicted using DNase hypersensitivity, histone modifications and DNA sequence features. We improve the performance of these models due to a novel feature in Janggu that allows us to include high-order sequence features. We believe that Janggu will help to significantly reduce repetitive programming overhead for deep learning applications in genomics, and will enable computational biologists to rapidly assess biological hypotheses.
Suggested Citation
Wolfgang Kopp & Remo Monti & Annalaura Tamburrini & Uwe Ohler & Altuna Akalin, 2020.
"Deep learning for genomics using Janggu,"
Nature Communications, Nature, vol. 11(1), pages 1-7, December.
Handle:
RePEc:nat:natcom:v:11:y:2020:i:1:d:10.1038_s41467-020-17155-y
DOI: 10.1038/s41467-020-17155-y