An Ensemble Method for Predicting Subnuclear Localizations from Primary Protein Structures

Author

Listed:
  • Guo Sheng Han
  • Zu Guo Yu
  • Vo Anh
  • Anaththa P D Krishnajith
  • Yu-Chu Tian

Abstract

Background: Predicting protein subnuclear localization is a challenging problem. Previous approaches based on non-sequence information, including Gene Ontology annotations and kernel fusion, each have limitations. The aim of this work is twofold: first, to propose a novel individual feature extraction method; second, to develop an ensemble method that improves prediction performance by using comprehensive information, represented as a high-dimensional feature vector obtained from 11 feature extraction methods.

Methodology/Principal Findings: A novel two-stage multiclass support vector machine is proposed to predict protein subnuclear localizations. It considers only feature extraction methods based on amino acid classifications and physicochemical properties. To speed up the system, an automatic search method for the kernel parameter is used. The prediction performance of our method is evaluated on four datasets: the Lei dataset, a multi-localization dataset, the SNL9 dataset, and a new independent dataset. In leave-one-out cross-validation, the overall prediction accuracy is 75.2% for 6 localizations on the Lei dataset and 72.1% for 9 localizations on the SNL9 dataset; the overall accuracy is 71.7% on the multi-localization dataset and 69.8% on the new independent dataset. Comparisons with existing methods show that our method performs better for both single-localization and multi-localization proteins and achieves more balanced sensitivities and specificities on large-size and small-size subcellular localizations. The overall accuracy improvements are 4.0% and 4.7% for single-localization proteins and 6.5% for multi-localization proteins. The reliability and stability of our classification model are further confirmed by permutation analysis.

Conclusions: Our method is effective and valuable for predicting protein subnuclear localizations. A web server implementing the proposed method is freely available at http://bioinformatics.awowshop.com/snlpred_page.php.
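
The abstract reports overall accuracy together with per-localization sensitivity and specificity. As a minimal sketch of how these quantities are commonly computed for a multiclass predictor (one class against the rest), the following may help. The function name `evaluate` and the example labels are hypothetical and are not taken from the paper, and the paper's exact definitions may differ.

```python
def evaluate(true_labels, predicted_labels):
    """Return (overall_accuracy, {class: (sensitivity, specificity)}).

    Sensitivity and specificity are computed one-vs-rest for each class.
    Illustrative helper only; not the authors' evaluation code.
    """
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")

    pairs = list(zip(true_labels, predicted_labels))
    n = len(pairs)
    overall_accuracy = sum(t == p for t, p in pairs) / n

    per_class = {}
    for c in sorted(set(true_labels) | set(predicted_labels)):
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        tn = n - tp - fn - fp                          # true negatives
        sensitivity = tp / (tp + fn) if tp + fn else float("nan")
        specificity = tn / (tn + fp) if tn + fp else float("nan")
        per_class[c] = (sensitivity, specificity)
    return overall_accuracy, per_class


# Hypothetical toy labels, for illustration only.
true = ["nucleolus", "nucleolus", "speckle", "speckle", "lamina", "lamina"]
pred = ["nucleolus", "speckle", "speckle", "speckle", "lamina", "nucleolus"]

accuracy, per_class = evaluate(true, pred)
print(f"overall accuracy: {accuracy:.3f}")
for name, (sens, spec) in per_class.items():
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Under leave-one-out cross-validation, each protein's predicted label would come from a model trained on all the other proteins, and the resulting predictions would be passed to a function like this one.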

Suggested Citation

  • Guo Sheng Han & Zu Guo Yu & Vo Anh & Anaththa P D Krishnajith & Yu-Chu Tian, 2013. "An Ensemble Method for Predicting Subnuclear Localizations from Primary Protein Structures," PLOS ONE, Public Library of Science, vol. 8(2), pages 1-14, February.
  • Handle: RePEc:plo:pone00:0057225
    DOI: 10.1371/journal.pone.0057225

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0057225
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0057225&type=printable
    Download Restriction: no


    Citations

    Cited by:

    1. Sun, Bingbin & Yao, Jialing & Xi, Lifeng, 2019. "Eigentime identities of fractal sailboat networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 520(C), pages 338-349.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yifeng Qi & Bin Zhang, 2021. "Chromatin network retards nucleoli coalescence," Nature Communications, Nature, vol. 12(1), pages 1-10, December.
    2. Lisa Streit & Timo Kuhn & Thomas Vomhof & Verena Bopp & Albert C. Ludolph & Jochen H. Weishaupt & J. Christof M. Gebhardt & Jens Michaelis & Karin M. Danzer, 2022. "Stress induced TDP-43 mobility loss independent of stress granules," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    3. Marta Vicioso-Mantis & Raquel Fueyo & Claudia Navarro & Sara Cruz-Molina & Wilfred F. J. Ijcken & Elena Rebollo & Álvaro Rada-Iglesias & Marian A. Martínez-Balbás, 2022. "JMJD3 intrinsically disordered region links the 3D-genome structure to TGFβ-dependent transcription activation," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    4. Brooke E. Danielsson & Bobin George Abraham & Elina Mäntylä & Jolene I. Cabe & Carl R. Mayer & Anna Rekonen & Frans Ek & Daniel E. Conway & Teemu O. Ihalainen, 2023. "Nuclear lamina strain states revealed by intermolecular force biosensor," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    5. Yu-Fei Gao & Lei Chen & Yu-Dong Cai & Kai-Yan Feng & Tao Huang & Yang Jiang, 2012. "Predicting Metabolic Pathways of Small Molecules and Enzymes Based on Interaction Information of Chemicals and Proteins," PLOS ONE, Public Library of Science, vol. 7(9), pages 1-9, September.
    6. Chi-Hua Tung & Chi-Wei Chen & Han-Hao Sun & Yen-Wei Chu, 2017. "Predicting human protein subcellular localization by heterogeneous and comprehensive approaches," PLOS ONE, Public Library of Science, vol. 12(6), pages 1-14, June.
    7. Craciun, Dana & Isvoran, Adriana & Avram, N.M., 2009. "Long range correlation of hydrophilicity and flexibility along the calcium binding protein chains," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 388(21), pages 4609-4618.
