Printed from https://ideas.repec.org/p/fip/fedgfe/2025-44.html

Total Recall? Evaluating the Macroeconomic Knowledge of Large Language Models

Abstract

We evaluate the ability of large language models (LLMs) to estimate historical macroeconomic variables and data release dates. We find that LLMs have precise knowledge of some recent statistics, but performance degrades as we go farther back in history. We highlight two particularly important kinds of recall errors: mixing first-print data with subsequent revisions (i.e., smoothing across vintages) and mixing data for past and future reference periods (i.e., smoothing within vintages). We also find that LLMs can often recall individual data release dates accurately, but aggregating across series shows that on any given day the LLM is likely to believe it has data in hand that have not yet been released. Our results indicate that while LLMs have impressively accurate recall, their errors point to some limitations when they are used for historical analysis or to mimic real-time forecasters.
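
The release-date point in the abstract can be illustrated with a minimal sketch. It assumes each series has a true release date and a date the model recalls; the series names, dates, and helper functions below are hypothetical and invented for illustration, not taken from the paper. Even when each recalled date is off by only a day, some days end up with at least one series the model wrongly treats as already published.

```python
from datetime import date

# Hypothetical data: (true release date, date the model recalls) per series.
# All values are invented for illustration, not taken from the paper.
RELEASES = {
    "series_a": (date(2020, 1, 10), date(2020, 1, 10)),  # recalled exactly
    "series_b": (date(2020, 1, 15), date(2020, 1, 14)),  # recalled one day early
    "series_c": (date(2020, 1, 17), date(2020, 1, 16)),  # recalled one day early
    "series_d": (date(2020, 1, 30), date(2020, 1, 31)),  # recalled one day late
}


def prematurely_believed(releases, as_of):
    """Series the model thinks are out on `as_of` but that are not yet released."""
    return sorted(
        name
        for name, (true_date, recalled_date) in releases.items()
        if recalled_date <= as_of < true_date
    )


def share_of_days_with_leak(releases, days):
    """Fraction of days on which at least one series is prematurely believed."""
    leaky = sum(1 for d in days if prematurely_believed(releases, d))
    return leaky / len(days)


january = [date(2020, 1, d) for d in range(1, 32)]
print(prematurely_believed(RELEASES, date(2020, 1, 14)))
print(share_of_days_with_leak(RELEASES, january))
```

With more series, each carrying its own small date error, the share of days with at least one prematurely believed release can only stay the same or rise, which is the aggregation effect the abstract describes.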

Suggested Citation

  • Leland D. Crane & Akhil Karra & Paul E. Soto, 2025. "Total Recall? Evaluating the Macroeconomic Knowledge of Large Language Models," Finance and Economics Discussion Series 2025-044, Board of Governors of the Federal Reserve System (U.S.).
  • Handle: RePEc:fip:fedgfe:2025-44
    DOI: 10.17016/FEDS.2025.044

    Download full text from publisher

    File URL: https://www.federalreserve.gov/econres/feds/files/2025044pap.pdf
    Download Restriction: no


    References listed on IDEAS

    1. Paul Glasserman & Caden Lin, 2023. "Assessing Look-Ahead Bias in Stock Return Predictions Generated By GPT Sentiment Analysis," Papers 2309.17322, arXiv.org.
    2. Alejandro Lopez-Lira & Yuehua Tang & Mingyin Zhu, 2025. "The Memorization Problem: Can We Trust LLMs' Economic Forecasts?," Papers 2504.14765, arXiv.org.
    3. Van Pham & Scott Cunningham, 2024. "Can Base ChatGPT be Used for Forecasting without Additional Optimization?," Papers 2404.07396, arXiv.org, revised Jul 2024.
    4. Anton Korinek, 2023. "Generative AI for Economic Research: Use Cases and Implications for Economists," Journal of Economic Literature, American Economic Association, vol. 61(4), pages 1281-1317, December.
    5. Benjamin S. Manning & Kehang Zhu & John J. Horton, 2024. "Automated Social Science: Language Models as Scientist and Subjects," Papers 2404.11794, arXiv.org, revised Apr 2024.
    6. Benjamin S. Manning & Kehang Zhu & John J. Horton, 2024. "Automated Social Science: Language Models as Scientist and Subjects," NBER Working Papers 32381, National Bureau of Economic Research, Inc.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Alejandro Lopez-Lira & Yuehua Tang & Mingyin Zhu, 2025. "The Memorization Problem: Can We Trust LLMs' Economic Forecasts?," Papers 2504.14765, arXiv.org.
    2. Dong, Mengming Michael & Stratopoulos, Theophanis C. & Wang, Victor Xiaoqi, 2024. "A scoping review of ChatGPT research in accounting and finance," International Journal of Accounting Information Systems, Elsevier, vol. 55(C).
    3. Sophia Kazinnik & Tara M. Sinclair, 2025. "FOMC In Silico: A Multi-Agent System for Monetary Policy Decision Modeling," Working Papers 2025-005, The George Washington University, The Center for Economic Research.
    4. Kevin He & Ran Shorrer & Mengjia Xia, 2025. "Human Misperception of Generative-AI Alignment: A Laboratory Experiment," PIER Working Paper Archive 25-019, Penn Institute for Economic Research, Department of Economics, University of Pennsylvania.
    5. Benjamin S. Manning & John J. Horton, 2025. "General Social Agents," Papers 2508.17407, arXiv.org, revised Sep 2025.
    6. Yikai Zhao & Jun Nagayasu & Xinyi Geng, 2024. "Measuring Climate Policy Uncertainty with LLMs: New Insights into Corporate Bond Credit Spreads," DSSR Discussion Papers 143, Graduate School of Economics and Management, Tohoku University.
    7. Matthew O. Jackson & Qiaozhu Me & Stephanie W. Wang & Yutong Xie & Walter Yuan & Seth Benzell & Erik Brynjolfsson & Colin F. Camerer & James Evans & Brian Jabarian & Jon Kleinberg & Juanjuan Meng & Se, 2025. "AI Behavioral Science," Papers 2509.13323, arXiv.org.
    8. So Kuroki & Yingtao Tian & Kou Misaki & Takashi Ikegami & Takuya Akiba & Yujin Tang, 2025. "Reimagining Agent-based Modeling with Large Language Model Agents via Shachi," Papers 2509.21862, arXiv.org, revised Oct 2025.
    9. Sugat Chaturvedi & Rochana Chaturvedi, 2025. "Who Gets the Callback? Generative AI and Gender Bias," Papers 2504.21400, arXiv.org.
    10. Alexander Erlei, 2025. "From Digital Distrust to Codified Honesty: Experimental Evidence on Generative AI in Credence Goods Markets," Papers 2509.06069, arXiv.org.
    11. Alejandro Lopez-Lira, 2025. "Can Large Language Models Trade? Testing Financial Theories with LLM Agents in Market Simulations," Papers 2504.10789, arXiv.org.
    12. Jian-Qiao Zhu & Haijiang Yan & Thomas L. Griffiths, 2024. "Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice," Papers 2405.19313, arXiv.org, revised May 2025.
    13. Felipe A. Csaszar & Harsh Ketkar & Hyunjin Kim, 2024. "Artificial Intelligence and Strategic Decision-Making: Evidence from Entrepreneurs and Investors," Papers 2408.08811, arXiv.org.
    14. Francesco Venturini, 2025. "Generative AI and Income Growth: Early Evidence on Global Data," Gospodarka Narodowa. The Polish Journal of Economics, Warsaw School of Economics, issue 3, pages 31-46.
    15. Can Celebi & Stefan Penczynski, 2024. "Using Large Language Models for Text Classification in Experimental Economics," Working Paper series, University of East Anglia, Centre for Behavioural and Experimental Social Science (CBESS) 24-01, School of Economics, University of East Anglia, Norwich, UK.
    16. Hui Chen & Antoine Didisheim & Luciano Somoza & Hanqing Tian, 2025. "A Financial Brain Scan of the LLM," Papers 2508.21285, arXiv.org.
    17. Manish Jha & Jialin Qian & Michael Weber & Baozhong Yang, 2024. "Generative AI, Managerial Expectations, and Economic Activity," Papers 2410.03897, arXiv.org, revised Nov 2025.
    18. Alex Kim & Maximilian Muhn & Valeri Nikolaev, 2024. "Financial Statement Analysis with Large Language Models," Papers 2407.17866, arXiv.org, revised Feb 2025.
    19. Julian Junyan Wang & Victor Xiaoqi Wang, 2025. "Assessing Consistency and Reproducibility in the Outputs of Large Language Models: Evidence Across Diverse Finance and Accounting Tasks," Papers 2503.16974, arXiv.org, revised Sep 2025.
    20. Stefania Albanesi & António Dias da Silva & Juan F Jimeno & Ana Lamo & Alena Wabitsch, 2025. "New technologies and jobs in Europe," Economic Policy, CEPR, CESifo, Sciences Po;CES;MSH, vol. 40(121), pages 71-139.

    More about this item


    JEL classification:

    • C53 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Forecasting and Prediction Models; Simulation Methods
    • C80 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - General
    • E37 - Macroeconomics and Monetary Economics - - Prices, Business Fluctuations, and Cycles - - - Forecasting and Simulation: Models and Applications

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:fip:fedgfe:2025-44. See general information about how to correct material in RePEc.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Ryan Wolfslayer; Keisha Fournillier. General contact details of provider: https://edirc.repec.org/data/frbgvus.html.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.