IDEAS home Printed from https://ideas.repec.org/a/plo/pcbi00/1006185.html
   My bibliography  Save this article

miRAW: A deep learning-based approach to predict microRNA targets by analyzing whole microRNA transcripts

Author

Listed:
  • Albert Pla
  • Xiangfu Zhong
  • Simon Rayner

Abstract

MicroRNAs (miRNAs) are small non-coding RNAs that regulate gene expression by binding to partially complementary regions within the 3’UTR of their target genes. Computational methods play an important role in target prediction and assume that the miRNA “seed region” (nt 2 to 8) is required for functional targeting, but typically only identify ∼80% of known bindings. Recent studies have highlighted a role for the entire miRNA, suggesting that a more flexible methodology is needed. We present a novel approach for miRNA target prediction based on Deep Learning (DL) which, rather than incorporating any knowledge (such as seed regions), investigates the entire miRNA and 3’TR mRNA nucleotides to learn a uninhibited set of feature descriptors related to the targeting process. We collected more than 150,000 experimentally validated homo sapiens miRNA:gene targets and cross referenced them with different CLIP-Seq, CLASH and iPAR-CLIP datasets to obtain ∼20,000 validated miRNA:gene exact target sites. Using this data, we implemented and trained a deep neural network—composed of autoencoders and a feed-forward network—able to automatically learn features describing miRNA-mRNA interactions and assess functionality. Predictions were then refined using information such as site location or site accessibility energy. In a comparison using independent datasets, our DL approach consistently outperformed existing prediction methods, recognizing the seed region as a common feature in the targeting process, but also identifying the role of pairings outside this region. Thermodynamic analysis also suggests that site accessibility plays a role in targeting but that it cannot be used as a sole indicator for functionality. Data and source code available at: https://bitbucket.org/account/user/bipous/projects/MIRAW.Author summary: microRNAs are small RNA molecules that regulate biological processes by binding to the 3’UTR of a gene and their dysregulation is associated with several diseases. Computationally predicting these targets remains a challenge as they only partially match their target and so there can be hundreds of targets for a single microRNA. Current tools assume that most of the knowledge defining a microRNA-gene interaction can be captured by analysing the binding produced in the seed region (∼ the first 8nt in the miRNA). However, recent studies show that the whole microRNA can be important and form non-canonical targets. Here, we use a target prediction methodology that relies on deep neural networks to automatically learn the relevant features describing microRNA-gene interactions for predicting microRNA targets. This means we make no assumptions about what is important, leaving the task to the deep neural network. A key part of the work is obtaining a suitable dataset. Thus, we collected and curated more than 150,000 experimentally verified microRNA targets and used them to train the network. Using this approach, we are able to gain a better understanding of non-canonical targets and to improve the accuracy of state-of-the-art prediction tools.

Suggested Citation

  • Albert Pla & Xiangfu Zhong & Simon Rayner, 2018. "miRAW: A deep learning-based approach to predict microRNA targets by analyzing whole microRNA transcripts," PLOS Computational Biology, Public Library of Science, vol. 14(7), pages 1-32, July.
  • Handle: RePEc:plo:pcbi00:1006185
    DOI: 10.1371/journal.pcbi.1006185
    as

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006185
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1006185&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1006185?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    More about this item

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:plo:pcbi00:1006185. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    We have no bibliographic references for this item. You can help adding them by using this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: ploscompbiol (email available below). General contact details of provider: https://journals.plos.org/ploscompbiol/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.