Benchmarking uncertainty quantification for protein engineering

Author

Listed:

Kevin P Greenman
Ava P Amini
Kevin K Yang

Abstract

Machine learning sequence-function models for proteins could enable significant advances in protein engineering, especially when paired with state-of-the-art methods to select new sequences for property optimization and/or model improvement. Such methods (Bayesian optimization and active learning) require calibrated estimations of model uncertainty. While studies have benchmarked a variety of deep learning uncertainty quantification (UQ) methods on standard and molecular machine-learning datasets, it is not clear if these results extend to protein datasets. In this work, we implemented a panel of deep learning UQ methods on regression tasks from the Fitness Landscape Inference for Proteins (FLIP) benchmark. We compared results across different degrees of distributional shift using metrics that assess each UQ method’s accuracy, calibration, coverage, width, and rank correlation. Additionally, we compared these metrics using one-hot encoding and pretrained language model representations, and we tested the UQ methods in retrospective active learning and Bayesian optimization settings. Our results indicate that there is no single best UQ method across all datasets, splits, and metrics, and that uncertainty-based sampling is often unable to outperform greedy sampling in Bayesian optimization. These benchmarks enable us to provide recommendations for more effective design of biological sequences using machine learning.

Author summary: Protein engineering has previously benefited from the use of machine learning models to guide the choice of new experiments. In many cases, the goal of conducting new experiments is optimizing for a property or improving the machine learning model. Many standard methods for these two tasks require good estimates of the uncertainty in the model’s predictions. Several methods for quantifying this uncertainty exist and have been benchmarked on datasets from other domains (e.g. small molecules), but it is not clear whether these results also apply for proteins. To address this, we evaluated a range of uncertainty quantification approaches on tasks derived from a protein-focused benchmark dataset. We tested performance on different degrees of distributional shift between the training and testing sets and on different representations of the sequences, and we assessed performance in terms of several standard metrics. Finally, we used the uncertainties for property optimization and model improvement. Our findings indicate that no single uncertainty estimation method excels across all scenarios. Moreover, uncertainty-based strategies for property optimization often did not outperform simpler methods that did not consider uncertainty. This research offers insights for the more efficacious application of machine learning in the realm of biological sequence design.

Suggested Citation

Kevin P Greenman & Ava P Amini & Kevin K Yang, 2025. "Benchmarking uncertainty quantification for protein engineering," PLOS Computational Biology, Public Library of Science, vol. 21(1), pages 1-19, January.

Handle: RePEc:plo:pcbi00:1012639
DOI: 10.1371/journal.pcbi.1012639
