IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0300024.html

Framework-based qualitative analysis of free responses of Large Language Models: Algorithmic fidelity

Authors
  • Aliya Amirova
  • Theodora Fteropoulli
  • Nafiso Ahmed
  • Martin R Cowie
  • Joel Z Leibo

Abstract

Today, with the advent of Large-scale generative Language Models (LLMs) it is now possible to simulate free responses to interview questions such as those traditionally analyzed using qualitative research methods. Qualitative methodology encompasses a broad family of techniques involving manual analysis of open-ended interviews or conversations conducted freely in natural language. Here we consider whether artificial “silicon participants” generated by LLMs may be productively studied using qualitative analysis methods in such a way as to generate insights that could generalize to real human populations. The key concept in our analysis is algorithmic fidelity, a validity concept capturing the degree to which LLM-generated outputs mirror human sub-populations’ beliefs and attitudes. By definition, high algorithmic fidelity suggests that latent beliefs elicited from LLMs may generalize to real humans, whereas low algorithmic fidelity renders such research invalid. Here we used an LLM to generate interviews with “silicon participants” matching specific demographic characteristics one-for-one with a set of human participants. Using framework-based qualitative analysis, we showed the key themes obtained from both human and silicon participants were strikingly similar. However, when we analyzed the structure and tone of the interviews we found even more striking differences. We also found evidence of a hyper-accuracy distortion. We conclude that the LLM we tested (GPT-3.5) does not have sufficient algorithmic fidelity to expect in silico research on it to generalize to real human populations. However, rapid advances in artificial intelligence raise the possibility that algorithmic fidelity may improve in the future. Thus we stress the need to establish epistemic norms now around how to assess the validity of LLM-based qualitative research, especially concerning the need to ensure the representation of heterogeneous lived experiences.

Suggested Citation

  • Aliya Amirova & Theodora Fteropoulli & Nafiso Ahmed & Martin R Cowie & Joel Z Leibo, 2024. "Framework-based qualitative analysis of free responses of Large Language Models: Algorithmic fidelity," PLOS ONE, Public Library of Science, vol. 19(3), pages 1-33, March.
  • Handle: RePEc:plo:pone00:0300024
    DOI: 10.1371/journal.pone.0300024

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0300024
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0300024&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0300024?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Navid Ghaffarzadegan & Aritra Majumdar & Ross Williams & Niyousha Hosseinichimeh, 2024. "Generative agent‐based modeling: an introduction and tutorial," System Dynamics Review, System Dynamics Society, vol. 40(1), January.
    2. Holtdirk, Tobias & Assenmacher, Dennis & Bleier, Arnim & Wagner, Claudia, 2024. "Fine-Tuning Large Language Models to Simulate German Voting Behaviour (Working Paper)," OSF Preprints udz28, Center for Open Science.
    3. Chen Gao & Xiaochong Lan & Nian Li & Yuan Yuan & Jingtao Ding & Zhilun Zhou & Fengli Xu & Yong Li, 2024. "Large language models empowered agent-based modeling and simulation: a survey and perspectives," Palgrave Communications, Palgrave Macmillan, vol. 11(1), pages 1-24, December.
    4. Niyousha Hosseinichimeh & Aritra Majumdar & Ross Williams & Navid Ghaffarzadegan, 2024. "From text to map: a system dynamics bot for constructing causal loop diagrams," System Dynamics Review, System Dynamics Society, vol. 40(3), July.
    5. Kevin Leyton-Brown & Paul Milgrom & Neil Newman & Ilya Segal, 2024. "Artificial Intelligence and Market Design: Lessons Learned from Radio Spectrum Reallocation," NBER Chapters, in: New Directions in Market Design, National Bureau of Economic Research, Inc.
    6. Capra, C. Monica & Kniesner, Thomas J., 2025. "Daniel Kahneman’s Underappreciated Last Published Paper: Empirical Implications for Benefit-Cost Analysis and a Chat Session Discussion with Bots," IZA Discussion Papers 17841, Institute of Labor Economics (IZA).
    7. Kirshner, Samuel N., 2024. "GPT and CLT: The impact of ChatGPT's level of abstraction on consumer recommendations," Journal of Retailing and Consumer Services, Elsevier, vol. 76(C).
    8. Zengqing Wu & Run Peng & Xu Han & Shuyuan Zheng & Yixin Zhang & Chuan Xiao, 2023. "Smart Agent-Based Modeling: On the Use of Large Language Models in Computer Simulations," Papers 2311.06330, arXiv.org, revised Dec 2023.
    9. Joshua C. Yang & Damian Dailisan & Marcin Korecki & Carina I. Hausladen & Dirk Helbing, 2024. "LLM Voting: Human Choices and AI Collective Decision Making," Papers 2402.01766, arXiv.org, revised Aug 2024.
    10. Nir Chemaya & Daniel Martin, 2023. "Perceptions and Detection of AI Use in Manuscript Preparation for Academic Journals," Papers 2311.14720, arXiv.org, revised Jan 2024.
    11. Nir Chemaya & Daniel Martin, 2024. "Perceptions and detection of AI use in manuscript preparation for academic journals," PLOS ONE, Public Library of Science, vol. 19(7), pages 1-16, July.
    12. Lijia Ma & Xingchen Xu & Yong Tan, 2024. "Crafting Knowledge: Exploring the Creative Mechanisms of Chat-Based Search Engines," Papers 2402.19421, arXiv.org.
    13. Ali Goli & Amandeep Singh, 2023. "Exploring the Influence of Language on Time-Reward Perceptions in Large Language Models: A Study Using GPT-3.5," Papers 2305.02531, arXiv.org, revised Jun 2023.
    14. Evangelos Katsamakas, 2024. "Business models for the simulation hypothesis," Papers 2404.08991, arXiv.org.
    15. Yuan Gao & Dokyun Lee & Gordon Burtch & Sina Fazelpour, 2024. "Take Caution in Using LLMs as Human Surrogates: Scylla Ex Machina," Papers 2410.19599, arXiv.org, revised Jan 2025.
    16. Jiaxin Liu & Yi Yang & Kar Yan Tam, 2025. "Evaluating and Aligning Human Economic Risk Preferences in LLMs," Papers 2503.06646, arXiv.org.
    17. Christoph Engel & Max R. P. Grossmann & Axel Ockenfels, 2023. "Integrating machine behavior into human subject experiments: A user-friendly toolkit and illustrations," Discussion Paper Series of the Max Planck Institute for Research on Collective Goods 2024_01, Max Planck Institute for Research on Collective Goods.
    18. Yiting Chen & Tracy Xiao Liu & You Shan & Songfa Zhong, 2023. "The emergence of economic rationality of GPT," Proceedings of the National Academy of Sciences, Proceedings of the National Academy of Sciences, vol. 120(51), pages 2316205120-, December.
    19. Ji Ma, 2025. "Steering Prosocial AI Agents: Computational Basis of LLM's Decision Making in Social Simulation," Papers 2504.11671, arXiv.org.
    20. Samuel Chang & Andrew Kennedy & Aaron Leonard & John A. List, 2024. "12 Best Practices for Leveraging Generative AI in Experimental Research," NBER Working Papers 33025, National Bureau of Economic Research, Inc.
