IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0320085.html
   My bibliography  Save this article

Utilizing a deep learning model based on BERT for identifying enhancers and their strength

Author

Listed:
  • Tong Wang
  • Mengqi Gao

Abstract

An enhancer is a specific DNA sequence typically located within a gene at upstream or downstream position and serves as a pivotal element in the regulation of eukaryotic gene transcription. Therefore, the recognition of enhancers is highly significant for comprehending gene expression regulatory systems. While some useful predictive models have been proposed, there are still deficiencies in these models. To address current limitations, we propose a model, DNABERT2-Enhancer, based on transformer architecture and deep learning, designed for the recognition of enhancers (classified as either enhancer or non-enhancer) and the identification of their activity (strong or weak enhancers). More specifically, DNABERT2-Enhancer is composed of a BERT model for extracting features and a CNN model for enhancers classification. Parameters of the BERT model are initialized by a pre-training DNABERT-2 language model. The enhancer recognition task is then fine-tuned through transfer learning to convert the original sequence into feature vectors. Subsequently, the CNN network is employed to learn the feature vector generated by BERT and produce the prediction results. In comparison with existing predictors utilizing the identical dataset, our approach demonstrates superior performance. This suggests that the model will be a useful instrument for academic research on the enhancer recognition.

Suggested Citation

  • Tong Wang & Mengqi Gao, 2025. "Utilizing a deep learning model based on BERT for identifying enhancers and their strength," PLOS ONE, Public Library of Science, vol. 20(4), pages 1-19, April.
  • Handle: RePEc:plo:pone00:0320085
    DOI: 10.1371/journal.pone.0320085
    as

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0320085
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0320085&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0320085?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    More about this item

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:plo:pone00:0320085. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    We have no bibliographic references for this item. You can help adding them by using this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: plosone (email available below). General contact details of provider: https://journals.plos.org/plosone/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.