How Humans Help LLMs: Assessing and Incentivizing Human Preference Annotators

Author

Listed:

Shang Liu
Hanzhao Wang
Zhongyao Ma
Xiaocheng Li

Abstract

Human-annotated preference data play an important role in aligning large language models (LLMs). In this paper, we study two connected questions: how to monitor the quality of human preference annotators and how to incentivize them to provide high-quality annotations. In current practice, expert-based monitoring is a natural workhorse for quality control, but it performs poorly in preference annotation because annotators are heterogeneous and downstream model performance is an indirect and noisy proxy for annotation quality. We therefore propose a self-consistency monitoring scheme tailored to preference annotation, and analyze the statistical sample complexity of both methods. This practitioner-facing analysis identifies how many inspected samples are needed to reliably assess an annotator and shows when self-consistency monitoring can outperform expert-based monitoring. We then use the resulting monitoring signal as the performance measure in a principal-agent model, which lets us study a second sample-complexity question: how many monitored samples are needed before simple contracts perform close to the ideal benchmark in which annotation quality is perfectly observable. Under a continuous action space, we show that the shortfall relative to this benchmark scales as $\Theta(1/\sqrt{\mathcal{I} n \log n})$ for binary contracts and $\Theta(1/(\mathcal{I}n))$ for linear contracts, where $\mathcal{I}$ is the Fisher information and $n$ is the number of samples; we further show that linear contracts are rate-optimal among general contracts. This contrasts with the known result that, when the action space is discrete, binary contracts are optimal with a shortfall of order $\exp(-\Theta(n))$ \citep{frick2023monitoring}.
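To give a feel for the rates quoted in the abstract, the following minimal sketch evaluates the three scalings for a few sample sizes. It is an illustration only: the $\Theta(\cdot)$ statements hide constants, which are set to 1 here as an assumption, and the function names and the value of the Fisher information are hypothetical choices, not taken from the paper.

```python
import math


def binary_gap(n: int, fisher_info: float) -> float:
    """Order of the binary-contract shortfall, 1/sqrt(I * n * log n).

    The hidden constant is assumed to be 1 for illustration.
    """
    if n < 2:
        raise ValueError("n must be at least 2 so that log(n) > 0")
    return 1.0 / math.sqrt(fisher_info * n * math.log(n))


def linear_gap(n: int, fisher_info: float) -> float:
    """Order of the linear-contract shortfall, 1/(I * n), constant assumed 1."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1.0 / (fisher_info * n)


def discrete_gap(n: int) -> float:
    """Order of the discrete-action shortfall, exp(-n), constant assumed 1."""
    return math.exp(-n)


if __name__ == "__main__":
    fisher_info = 1.0  # hypothetical value of the Fisher information I
    for n in (10, 100, 1000, 10000):
        print(
            f"n={n:>6}  binary~{binary_gap(n, fisher_info):.2e}  "
            f"linear~{linear_gap(n, fisher_info):.2e}  "
            f"discrete~{discrete_gap(n):.2e}"
        )
```

With the constants fixed this way, the linear-contract value falls off faster in $n$ than the binary-contract value, and both fall off far more slowly than the exponential rate cited for discrete action spaces.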

Suggested Citation

Shang Liu & Hanzhao Wang & Zhongyao Ma & Xiaocheng Li, 2025. "How Humans Help LLMs: Assessing and Incentivizing Human Preference Annotators," Papers 2502.06387, arXiv.org, revised Apr 2026.

Handle: RePEc:arx:papers:2502.06387

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

George Georgiadis & Balazs Szentes, 2020. "Optimal Monitoring Design," Econometrica, Econometric Society, vol. 88(5), pages 2075-2107, September.
Rosenthal, Maxwell, 2023. "Robust incentives for risk," Journal of Mathematical Economics, Elsevier, vol. 109(C).
Burkett, Justin & Rosenthal, Maxwell, 2024. "Statistical uncertainty and coarse contracts," Journal of Economic Theory, Elsevier, vol. 220(C).
Carroll, Gabriel & Bolte, Lukas, 2023. "Robust contracting under double moral hazard," Theoretical Economics, Econometric Society, vol. 18(4), November.
Inés Macho-Stadler & David Pérez-Castrillo, 2018. "Moral hazard: Base models and two extensions," Chapters, in: Luis C. Corchón & Marco A. Marini (ed.), Handbook of Game Theory and Industrial Organization, Volume I, chapter 16, pages 453-485, Edward Elgar Publishing.
- David Pérez-Castrillo & Inés Macho-Stadler, 2016. "Moral Hazard: Base Models and Two Extensions," Working Papers 883, Barcelona School of Economics.
- Ines Macho-Stadler & David Pérez-Castrillo, 2016. "Moral Hazard: Base Models and Two Extensions," CESifo Working Paper Series 5851, CESifo.
Paul Dütting & Michal Feldman & Daniel Peretz & Larry Samuelson, 2024. "Ambiguous Contracts," Econometrica, Econometric Society, vol. 92(6), pages 1967-1992, November.
Qian, Cheng & Li, Zhaolin & Fu, Qi, 2026. "Managing inventory and financing decisions under ambiguity," Omega, Elsevier, vol. 140(C).
Xianyi Wang & Xiaofang Wang & Hui He, 2021. "Contracts to Coordinate Healthcare Providers in the Telemedicine Referral System," Sustainability, MDPI, vol. 13(18), pages 1-25, September.
Matsushima, Hitoshi & Noda, Shunya, 2023. "Mechanism design with general ex-ante investments," Journal of Mathematical Economics, Elsevier, vol. 106(C).
- Hitoshi Matsushima & Shunya Noda, 2019. "Mechanism Design with General Ex-Ante Investments," CIRJE F-Series CIRJE-F-1124, CIRJE, Faculty of Economics, University of Tokyo.
Peter Zhang, 2023. "Distributionally Robust Principal-Agent Problems and Optimality of Contracts," Papers 2303.07468, arXiv.org, revised Jan 2024.
Hitoshi Matsushima & Shunya Noda, 2019. "Mechanism Design with General Ex-Ante Investments (Revised version of F415 )," CARF F-Series CARF-F-464, Center for Advanced Research in Finance, Faculty of Economics, The University of Tokyo.
Hitoshi Matsushima & Shunya Noda, 2017. "Mechanism Design in Hidden Action and Hidden Information: Richness and Pure-VCG," CIRJE F-Series CIRJE-F-1057, CIRJE, Faculty of Economics, University of Tokyo.
Hitoshi Matsushima & Shunya Noda, 2016. "Mechanism Design in Hidden Action and Hidden Information: Richness and Pure Groves," CARF F-Series CARF-F-386, Center for Advanced Research in Finance, Faculty of Economics, The University of Tokyo.
- Hitoshi Matsushima & Shunya Noda, 2016. "Mechanism Design in Hidden Action and Hidden Information: Richness and Pure Groves," CIRJE F-Series CIRJE-F-1015, CIRJE, Faculty of Economics, University of Tokyo.
Bartsch, Elga, 1996. "Enforcement of environmental liability in the case of uncertain causality and asymmetric information," Kiel Working Papers 755, Kiel Institute for the World Economy.
Atalay Atasu & Dragos Florin Ciocan & Antoine Désir, 2024. "Price Delegation with Learning Agents," Management Science, INFORMS, vol. 70(8), pages 5540-5556, August.
Lilia Filipova, 2007. "Monitoring and Privacy in Automobile Insurance Markets with Moral Hazard," Discussion Paper Series 293, Universitaet Augsburg, Institute for Economics.
Tal Alon & Paul Dutting & Yingkai Li & Inbal Talgam-Cohen, 2022. "Approximate Optimality of Linear Contracts Under Uncertainty," Papers 2211.06850, arXiv.org, revised Mar 2025.
Paul Duetting & Michal Feldman & Inbal Talgam-Cohen, 2024. "Algorithmic Contract Theory: A Survey," Papers 2412.16384, arXiv.org.
Kim, Son Ku & Wang, Susheng, 1998. "Linear Contracts and the Double Moral-Hazard," Journal of Economic Theory, Elsevier, vol. 82(2), pages 342-378, October.
Martin Dumav, 2021. "Moral Hazard, Dynamic Incentives, and Ambiguous Perceptions," Papers 2110.15229, arXiv.org.

More about this item

NEP fields

This paper has been announced in the following NEP Reports:

NEP-AIN-2025-03-17 (Artificial Intelligence)
NEP-CMP-2025-03-17 (Computational Economics)
