Printed from https://ideas.repec.org/p/arx/papers/2512.21080.html

LLM Personas as a Substitute for Field Experiments in Method Benchmarking

Author

Listed:
  • Enoch Hyunwook Kang

Abstract

Field experiments (A/B tests) are often the most credible benchmark for methods (algorithms) in societal systems, but their cost and latency bottleneck rapid methodological progress. LLM-based persona simulation offers a cheap synthetic alternative, yet it is unclear whether replacing humans with personas preserves the benchmark interface that adaptive methods optimize against. We prove an if-and-only-if characterization: when (i) methods observe only the aggregate outcome (aggregate-only observation) and (ii) evaluation depends only on the submitted artifact and not on the method's identity or provenance (method-blind evaluation), swapping humans for personas is just a panel change from the method's point of view, indistinguishable from changing the evaluation population (e.g., New York to Jakarta). Furthermore, we move from validity to usefulness: we define an information-theoretic discriminability of the induced aggregate channel and show that making persona benchmarking as decision-relevant as a field experiment is fundamentally a sample-size question, yielding explicit bounds on the number of independent persona evaluations required to reliably distinguish meaningfully different methods at a chosen resolution.
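To make the "sample-size question" concrete, here is a minimal sketch of how such a bound can look in the simplest textbook setting. It is not the paper's bound or its definition of discriminability. It assumes, purely for illustration, that each independent persona evaluation returns a Gaussian aggregate outcome with a known standard deviation `sigma`, and that two methods differ in mean outcome by `delta` (the chosen resolution). Under those assumptions, thresholding the sample mean at the midpoint misidentifies the better method with probability at most exp(-n * delta^2 / (8 * sigma^2)). The function names and parameters below are hypothetical.

```python
import math


def chernoff_exponent(delta: float, sigma: float) -> float:
    """Per-evaluation error exponent for telling apart N(mu, sigma^2) and
    N(mu + delta, sigma^2) with a midpoint threshold on the sample mean.

    Illustrative assumption: Gaussian outcomes with equal, known variance.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return delta ** 2 / (8.0 * sigma ** 2)


def evaluations_needed(delta: float, sigma: float, error_prob: float) -> int:
    """Smallest n with exp(-n * exponent) <= error_prob.

    This is a sufficient number of independent evaluations under the
    Gaussian assumption above, not a bound taken from the paper.
    """
    if not 0.0 < error_prob < 1.0:
        raise ValueError("error_prob must lie strictly between 0 and 1")
    exponent = chernoff_exponent(delta, sigma)
    if exponent == 0.0:
        raise ValueError("methods with identical means cannot be distinguished")
    return math.ceil(math.log(1.0 / error_prob) / exponent)


if __name__ == "__main__":
    # Hypothetical numbers: resolution 0.1, unit noise, 5% error tolerance.
    print(evaluations_needed(delta=0.1, sigma=1.0, error_prob=0.05))
    # Halving the resolution roughly quadruples the required evaluations.
    print(evaluations_needed(delta=0.05, sigma=1.0, error_prob=0.05))
```

The sketch shows the qualitative point of the abstract: once the two benchmark conditions hold, the remaining question is how many independent evaluations are needed, and that number grows with the inverse square of the resolution at which methods must be told apart.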

Suggested Citation

  • Enoch Hyunwook Kang, 2025. "LLM Personas as a Substitute for Field Experiments in Method Benchmarking," Papers 2512.21080, arXiv.org, revised Jan 2026.
  • Handle: RePEc:arx:papers:2512.21080

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2512.21080
    File Function: Latest version
    Download Restriction: no
