
Introducing a Clustering Step in a Consensus Approach for the Scoring of Protein-Protein Docking Models

Author

Listed:
  • Edrisse Chermak
  • Renato De Donato
  • Marc F Lensink
  • Andrea Petta
  • Luigi Serra
  • Vittorio Scarano
  • Luigi Cavallo
  • Romina Oliva

Abstract

Correctly scoring protein-protein docking models to single out native-like ones is an open challenge. It is also an object of assessment in CAPRI (Critical Assessment of PRedicted Interactions), the community-wide blind docking experiment. We previously introduced the first pure consensus method in the field, CONSRANK, which ranks models by how well they match the most conserved contacts in the ensemble they belong to. In CAPRI, scorers are asked to evaluate a set of available models and select the top ten, based on their own scoring approach. Scorers’ performance is ranked by the number of targets/interfaces for which they provide at least one correct solution. By this measure, blind testing in CAPRI Round 30 (a joint prediction round with CASP11) has shown that the critical cases for CONSRANK are targets that show multiple interfaces or for which only a very small number of correct solutions are available. To address these challenging cases, CONSRANK has now been modified to include a contact-based clustering of the models as a preliminary step of the scoring process. We used agglomerative hierarchical clustering based on the number of inter-residue contacts shared between models. Two criteria, each with different thresholds, were explored for cluster generation: fixing either the number of common contacts or the total number of clusters. For each clustering approach, the ten most populated clusters were selected, CONSRANK was run on each of them, and the top-ranked model of each cluster was chosen, up to a limit of ten models per target. We applied the modified scoring approach, Clust-CONSRANK, to SCORE_SET, a set of CAPRI scoring models recently made available by the CAPRI assessors, and to the subset of homodimeric targets in CAPRI Round 30 for which CONSRANK failed to include a correct solution among the ten selected models. Results show that, for the challenging cases, the clustering step typically enriches the ten top-ranked models in native-like solutions. The best-performing clustering approaches we tested more than double the number of cases for which at least one correct solution is included among the top ten ranked models.
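
The cluster-then-rank idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each model is given as a set of inter-residue contacts, that the consensus score of a model is the mean ensemble frequency of its contacts (a simplified stand-in for the CONSRANK score), and that clusters are merged by average linkage on the number of shared contacts. The function names, the linkage choice, the `min_common` threshold and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def consensus_rank(models):
    """Rank model names by the mean ensemble frequency of their contacts.

    Simplified stand-in for a consensus score (assumption, not CONSRANK itself).
    """
    n = len(models)
    freq = Counter(c for contacts in models.values() for c in contacts)
    scores = {
        name: sum(freq[c] for c in contacts) / (n * len(contacts)) if contacts else 0.0
        for name, contacts in models.items()
    }
    # Highest score first; ties broken alphabetically for reproducibility.
    return sorted(scores, key=lambda name: (-scores[name], name))


def cluster_by_common_contacts(models, min_common):
    """Agglomerative clustering; merge while average shared contacts >= min_common.

    Average linkage is an assumption made for this sketch.
    """
    clusters = [[name] for name in sorted(models)]

    def similarity(a, b):
        pairs = [(x, y) for x in a for y in b]
        return sum(len(models[x] & models[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), similarity(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < min_common:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def clust_consensus_select(models, min_common, max_models=10):
    """Pick the top consensus-ranked model from each of the largest clusters."""
    clusters = cluster_by_common_contacts(models, min_common)
    clusters.sort(key=lambda c: (-len(c), c[0]))  # most populated first
    selected = []
    for members in clusters[:max_models]:
        subset = {name: models[name] for name in members}
        selected.append(consensus_rank(subset)[0])
    return selected


# Toy ensemble (made up): m1-m3 share one interface, m4-m5 describe another.
models = {
    "m1": {("A1", "B1"), ("A2", "B2"), ("A3", "B3")},
    "m2": {("A1", "B1"), ("A2", "B2"), ("A5", "B5")},
    "m3": {("A1", "B1"), ("A2", "B2"), ("A3", "B3"), ("A4", "B4")},
    "m4": {("A9", "B9"), ("A8", "B8")},
    "m5": {("A9", "B9"), ("A8", "B8"), ("A7", "B7")},
}

plain_top2 = consensus_rank(models)[:2]
clustered_top2 = clust_consensus_select(models, min_common=2, max_models=2)
print(plain_top2)      # ['m1', 'm2']
print(clustered_top2)  # ['m1', 'm4']
```

In this toy ensemble, plain consensus ranking fills both top slots with models of the more populated interface, whereas the clustered selection returns one representative per interface. The second criterion mentioned in the abstract, fixing the total number of clusters, would correspond to replacing the `min_common` stopping test with a check on `len(clusters)`.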

Suggested Citation

  • Edrisse Chermak & Renato De Donato & Marc F Lensink & Andrea Petta & Luigi Serra & Vittorio Scarano & Luigi Cavallo & Romina Oliva, 2016. "Introducing a Clustering Step in a Consensus Approach for the Scoring of Protein-Protein Docking Models," PLOS ONE, Public Library of Science, vol. 11(11), pages 1-15, November.
  • Handle: RePEc:plo:pone00:0166460
    DOI: 10.1371/journal.pone.0166460

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0166460
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0166460&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0166460?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Müllner, Daniel, 2013. "fastcluster: Fast Hierarchical, Agglomerative Clustering Routines for R and Python," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 53(i09).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bahman Panahi & Mohammad Farhadian & Mohammad Amin Hejazi, 2020. "Systems biology approach identifies functional modules and regulatory hubs related to secondary metabolites accumulation after transition from autotrophic to heterotrophic growth condition in microalg," PLOS ONE, Public Library of Science, vol. 15(2), pages 1-15, February.
    2. Benedict Anchang & Mary T Do & Xi Zhao & Sylvia K Plevritis, 2014. "CCAST: A Model-Based Gating Strategy to Isolate Homogeneous Subpopulations in a Heterogeneous Population of Single Cells," PLOS Computational Biology, Public Library of Science, vol. 10(7), pages 1-14, July.
