
L1pred: A Sequence-Based Prediction Tool for Catalytic Residues in Enzymes with the L1-logreg Classifier

Author

Listed:
  • Yongchao Dou
  • Jun Wang
  • Jialiang Yang
  • Chi Zhang

Abstract

Identifying the catalytic residues of an enzyme is usually the first step toward understanding its function. Knowledge of catalytic residues is also useful for protein engineering and drug design. However, identifying catalytic residues experimentally remains challenging because of the time and cost involved, so computational methods have been explored to predict them. Here, we developed a new algorithm, L1pred, for catalytic residue prediction, which uses the L1-logreg classifier to integrate eight sequence-based scoring functions. We tested L1pred and compared it against several existing sequence-based methods on two carefully designed datasets, Data604 and Data63. With ten-fold cross-validation on the training dataset, Data604, L1pred achieved an area under the precision-recall curve (AUPR) of 0.2198 and an area under the ROC curve (AUC) of 0.9494. On the independent test dataset, Data63, it achieved an AUPR of 0.2636 and an AUC of 0.9375. Compared with other sequence-based methods, L1pred showed the best performance on both datasets. We also analyzed the importance of each attribute in the algorithm and found that all the scores contributed roughly equally to the performance of L1pred.
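As a rough illustration of the idea in the abstract, the sketch below fits an L1-regularized logistic regression over eight per-residue scores and reports AUC and AUPR. It is not the authors' implementation: the data are synthetic, the solver is a plain proximal gradient loop swapped in for whatever solver L1pred uses, the metrics are computed on the training data rather than by ten-fold cross-validation, and all names (`fit_l1_logreg`, `make_toy_data`, `lam`, `step`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of t * |v|; this step is what can produce exact zeros.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_logreg(X, y, lam=0.01, step=0.5, iters=500):
    """Minimise mean log-loss + lam * ||w||_1 by proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= step * grad_b / n  # the intercept is not penalised
        w = [soft_threshold(wj - step * gj / n, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def predict_scores(X, w, b):
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


def roc_auc(scores, labels):
    # Probability that a random positive outranks a random negative (ties count 0.5).
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    # Step-wise summary of the precision-recall curve (one common AUPR estimate).
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, l) in enumerate(ranked, start=1):
        if l == 1:
            hits += 1
            total += hits / rank
    return total / hits


def make_toy_data(n=400, d=8, seed=0):
    # Synthetic stand-in (assumption): 8 scores per residue, about 10% positives,
    # and only the first three scores carry signal.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.1 else 0
        X.append([rng.gauss(1.5 * label if j < 3 else 0.0, 1.0) for j in range(d)])
        y.append(label)
    return X, y


X, y = make_toy_data()
w, b = fit_l1_logreg(X, y)
scores = predict_scores(X, w, b)
auc = roc_auc(scores, y)
aupr = average_precision(scores, y)
print(f"weights={[round(v, 3) for v in w]} AUC={auc:.3f} AUPR={aupr:.3f}")
```

Because catalytic residues are rare, AUPR tends to be far lower than AUC for the same predictor, which is consistent with the gap between the two figures reported above. Raising the hypothetical `lam` parameter penalizes the weights more strongly and generally drives more of them to exactly zero.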

Suggested Citation

  • Yongchao Dou & Jun Wang & Jialiang Yang & Chi Zhang, 2012. "L1pred: A Sequence-Based Prediction Tool for Catalytic Residues in Enzymes with the L1-logreg Classifier," PLOS ONE, Public Library of Science, vol. 7(4), pages 1-7, April.
  • Handle: RePEc:plo:pone00:0035666
    DOI: 10.1371/journal.pone.0035666

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0035666
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0035666&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0035666?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yu-Tung Chien & Shao-Wei Huang, 2012. "Accurate Prediction of Protein Catalytic Residues by Side Chain Orientation and Residue Contact Density," PLOS ONE, Public Library of Science, vol. 7(10), pages 1-11, October.
    2. Cristina Marino Buslje & Elin Teppa & Tomas Di Doménico & José María Delfino & Morten Nielsen, 2010. "Networks of High Mutual Information Define the Structural Proximity of Catalytic Sites: Implications for Catalytic Residue Identification," PLOS Computational Biology, Public Library of Science, vol. 6(11), pages 1-8, November.
    3. John A Capra & Roman A Laskowski & Janet M Thornton & Mona Singh & Thomas A Funkhouser, 2009. "Predicting Protein Ligand Binding Sites by Combining Evolutionary Sequence Conservation and 3D Structure," PLOS Computational Biology, Public Library of Science, vol. 5(12), pages 1-18, December.
