IDEAS home Printed from https://ideas.repec.org/a/plo/pcbi00/1006641.html

Bayesian inference of protein conformational ensembles from limited structural data

Author

Listed:
  • Wojciech Potrzebowski
  • Jill Trewhella
  • Ingemar Andre

Abstract

Many proteins consist of folded domains connected by regions with higher flexibility. The details of the resulting conformational ensemble play a central role in controlling interactions between domains and with binding partners. Small-Angle Scattering (SAS) is well suited to studying the conformational states adopted by proteins in solution. However, analysis is complicated by the limited information content in SAS data, and care must be taken to avoid constructing overly complex ensemble models and fitting to noise in the experimental data. To address these challenges, we developed a method based on Bayesian statistics that infers conformational ensembles from a structural library generated by all-atom Monte Carlo simulations. The first stage of the method involves a fast model selection based on variational Bayesian inference that maximizes the model evidence of the selected ensemble. This is followed by a complete Bayesian inference of population weights in the selected ensemble. Experiments with simulated ensembles demonstrate that model evidence is capable of identifying the correct ensemble and that the correct number of ensemble members can be recovered up to high levels of noise. Using experimental data, we demonstrate how the method can be extended to include data from Nuclear Magnetic Resonance (NMR) and structural energies of conformers extracted from all-atom energy functions. We show that the data from SAXS, NMR chemical shifts and energies calculated from conformers can work synergistically to improve the definition of the conformational ensemble.

Author summary: Proteins are commonly built up from folded domains connected by regions with higher flexibility. The interdomain orientations encoded by such hinges or linkers can play central roles in controlling the function of multidomain proteins, which makes them important to characterize. Small Angle X-ray Scattering (SAXS) is uniquely suited to studying the conformational ensembles adopted by these kinds of proteins. However, because of the limited information provided by SAXS, ensemble models must be built by combination with other information sources, and care has to be taken to avoid constructing ensembles that are more complex than the data can support. We developed a method based on Bayesian statistics that combines data from molecular simulation with experimental data from SAXS and Nuclear Magnetic Resonance while automatically balancing the complexity of the ensemble model with the information in the data. We demonstrate that this method is capable of accurate inference of ensembles even in the presence of high levels of experimental noise. The method represents a general approach to combining data and simulation in the modeling of protein ensembles and can be extended to employ additional sources of experimental information.
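
To make the idea of inferring population weights and comparing ensembles by model evidence concrete, here is a minimal sketch in Python. It is not the authors' implementation: the paper uses variational Bayesian inference followed by a full Bayesian inference, whereas this toy swaps in plain Monte Carlo sampling from the prior. Everything in it is an assumption made for illustration, including the Guinier-like toy profiles, the radii of gyration, the noise level, the flat Dirichlet prior, and the helper name `infer_weights`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical structural library: toy Guinier-like scattering curves
# for three conformers (assumed radii of gyration, in Angstrom).
q = np.linspace(0.01, 0.30, 50)                      # scattering vector values
rg = np.array([15.0, 20.0, 30.0])
profiles = np.exp(-(q[None, :] * rg[:, None]) ** 2 / 3.0)   # shape (3, 50)

# Synthetic "experimental" curve: weighted average of the library plus noise.
true_w = np.array([0.6, 0.3, 0.1])
sigma = 0.005                                        # assumed noise level
data = true_w @ profiles + rng.normal(0.0, sigma, size=q.size)


def infer_weights(profiles, data, sigma, n_samples=50_000, rng=rng):
    """Estimate posterior-mean weights and log evidence by prior sampling.

    Weights are drawn from a flat Dirichlet prior and scored with a
    Gaussian likelihood on the ensemble-averaged curve.
    """
    k = profiles.shape[0]
    w = rng.dirichlet(np.ones(k), size=n_samples)    # (n_samples, k)
    resid = (w @ profiles - data) / sigma
    log_like = (-0.5 * np.sum(resid ** 2, axis=1)
                - data.size * np.log(sigma * np.sqrt(2.0 * np.pi)))
    shift = log_like.max()                           # avoid underflow in exp
    like = np.exp(log_like - shift)
    post_mean = (like[:, None] * w).sum(axis=0) / like.sum()
    log_evidence = shift + np.log(like.mean())       # Monte Carlo estimate
    return post_mean, log_evidence


# Full three-member ensemble versus a two-member ensemble that omits
# the most populated conformer.
post_mean, log_ev_full = infer_weights(profiles, data, sigma)
_, log_ev_pair = infer_weights(profiles[1:], data, sigma)

print("posterior mean weights:", np.round(post_mean, 3))
print("log evidence (3 members):", log_ev_full)
print("log evidence (2 members):", log_ev_pair)
```

Sampling from the prior is only practical for very small ensembles like this one; it is used here because it yields both the weight estimate and an evidence estimate from the same draws, which mirrors the two quantities the abstract refers to.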

Suggested Citation

  • Wojciech Potrzebowski & Jill Trewhella & Ingemar Andre, 2018. "Bayesian inference of protein conformational ensembles from limited structural data," PLOS Computational Biology, Public Library of Science, vol. 14(12), pages 1-27, December.
  • Handle: RePEc:plo:pcbi00:1006641
    DOI: 10.1371/journal.pcbi.1006641

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006641
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1006641&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1006641?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Citations


    Cited by:

    1. Sławomir Dorocki & Joanna Korzeniowska, 2023. "Soil Contamination with Metals in Mountainous: A Case Study of Jaworzyna Krynicka in the Beskidy Mountains (Poland)," IJERPH, MDPI, vol. 20(6), pages 1-10, March.
    2. repec:plo:pcbi00:1007870 is not listed on IDEAS

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. repec:plo:pone00:0026936 is not listed on IDEAS
    2. Sean L Seyler & Avishek Kumar & M F Thorpe & Oliver Beckstein, 2015. "Path Similarity Analysis: A Method for Quantifying Macromolecular Pathways," PLOS Computational Biology, Public Library of Science, vol. 11(10), pages 1-37, October.
    3. repec:plo:pcbi00:1000295 is not listed on IDEAS
    4. Francis, David C. & Kubinec, Robert, 2022. "Beyond Political Connections: A Measurement Model Approach to Estimating Firm-level Political Influence in 41 Economies," Policy Research Working Paper Series 10119, The World Bank.
    5. Yongping Bao & Ludwig Danwitz & Fabian Dvorak & Sebastian Fehrler & Lars Hornuf & Hsuan Yu Lin & Bettina von Helversen, 2022. "Similarity and Consistency in Algorithm-Guided Exploration," CESifo Working Paper Series 10188, CESifo.
    6. Heinrich, Torsten & Yang, Jangho & Dai, Shuanping, 2020. "Growth, development, and structural change at the firm-level: The example of the PR China," MPRA Paper 105011, University Library of Munich, Germany.
    7. Xin Xu & Yang Lu & Yupeng Zhou & Zhiguo Fu & Yanjie Fu & Minghao Yin, 2021. "An Information-Explainable Random Walk Based Unsupervised Network Representation Learning Framework on Node Classification Tasks," Mathematics, MDPI, vol. 9(15), pages 1-14, July.
    8. Spilker Finn & Ötting Marius, 2024. "No cheering in the background? Individual performance in professional darts during COVID-19," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 20(3), pages 219-234.
    9. Xiaoyue Xi & Simon E. F. Spencer & Matthew Hall & M. Kate Grabowski & Joseph Kagaayi & Oliver Ratmann & Rakai Health Sciences Program and PANGEA‐HIV, 2022. "Inferring the sources of HIV infection in Africa from deep‐sequence data with semi‐parametric Bayesian Poisson flow models," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(3), pages 517-540, June.
    10. Luo, Nanyu & Ji, Feng & Han, Yuting & He, Jinbo & Zhang, Xiaoya, 2024. "Fitting item response theory models using deep learning computational frameworks," OSF Preprints tjxab, Center for Open Science.
    11. Joseph B. Bak-Coleman & Ian Kennedy & Morgan Wack & Andrew Beers & Joseph S. Schafer & Emma S. Spiro & Kate Starbird & Jevin D. West, 2022. "Combining interventions to reduce the spread of viral misinformation," Nature Human Behaviour, Nature, vol. 6(10), pages 1372-1380, October.
    12. David M. Phillippo & Sofia Dias & A. E. Ades & Mark Belger & Alan Brnabic & Alexander Schacht & Daniel Saure & Zbigniew Kadziola & Nicky J. Welton, 2020. "Multilevel network meta‐regression for population‐adjusted treatment comparisons," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(3), pages 1189-1210, June.
    13. Alina Ferecatu & Arnaud Bruyn & Prithwiraj Mukherjee, 2024. "Silently killing your panelists one email at a time: The true cost of email solicitations," Journal of the Academy of Marketing Science, Springer, vol. 52(4), pages 1216-1239, July.
    14. Alistair Bailey & Andy van Hateren & Tim Elliott & Jörn M Werner, 2014. "Two Polymorphisms Facilitate Differences in Plasticity between Two Chicken Major Histocompatibility Complex Class I Proteins," PLOS ONE, Public Library of Science, vol. 9(2), pages 1-11, February.
    15. Burbano, Vanessa & Padilla, Nicolas & Meier, Stephan, 2020. "Gender Differences in Preferences for Meaning at Work," IZA Discussion Papers 13053, Institute of Labor Economics (IZA).
    16. Robert Kubinec & Haillie Na‐Kyung Lee & Andrey Tomashevskiy, 2021. "Politically connected companies are less likely to shutdown due to COVID‐19 restrictions," Social Science Quarterly, Southwestern Social Science Association, vol. 102(5), pages 2155-2169, September.
    17. Barrington-Leigh, C.P., 2024. "The econometrics of happiness: Are we underestimating the returns to education and income?," Journal of Public Economics, Elsevier, vol. 230(C).
    18. Salvatore Nunnari & Massimiliano Pozzi, 2022. "Meta-Analysis of Inequality Aversion Estimates," CESifo Working Paper Series 9851, CESifo.
    19. Andreas Kryger Jensen & Claus Thorn Ekstrøm, 2021. "Quantifying the trendiness of trends," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(1), pages 98-121, January.
    20. Lauderdale, Benjamin E. & Bailey, Delia & Blumenau, Jack & Rivers, Douglas, 2020. "Model-based pre-election polling for national and sub-national outcomes in the US and UK," International Journal of Forecasting, Elsevier, vol. 36(2), pages 399-413.
    21. Michael A Jamros & Leandro C Oliveira & Paul C Whitford & José N Onuchic & Joseph A Adams & Patricia A Jennings, 2012. "Substrate-Specific Reorganization of the Conformational Ensemble of CSK Implicates Novel Modes of Kinase Function," PLOS Computational Biology, Public Library of Science, vol. 8(9), pages 1-8, September.
    22. Tamara Broderick & Ryan Giordano & Rachael Meager, 2020. "An Automatic Finite-Sample Robustness Metric: When Can Dropping a Little Data Make a Big Difference?," Papers 2011.14999, arXiv.org, revised Jul 2023.
