
Total Recall? Evaluating the Macroeconomic Knowledge of Large Language Models

Authors

  • Leland D. Crane
  • Akhil Karra
  • Paul E. Soto

Abstract

We evaluate the ability of large language models (LLMs) to estimate historical macroeconomic variables and data release dates. We find that LLMs have precise knowledge of some recent statistics, but performance degrades as we go farther back in history. We highlight two particularly important kinds of recall errors: mixing together first-print data with subsequent revisions (i.e., smoothing across vintages) and mixing data for past and future reference periods (i.e., smoothing within vintages). We also find that LLMs can often recall individual data release dates accurately, but aggregating across series shows that on any given day the LLM is likely to believe it has data in hand that has not yet been released. Our results indicate that while LLMs have impressively accurate recall, their errors point to some limitations when they are used for historical analysis or to mimic real-time forecasters.
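
The aggregation point in the abstract can be illustrated with a little arithmetic. The sketch below is not the authors' method: it assumes, purely for illustration, that each series carries the same small chance that the model wrongly believes an unreleased figure is already available, and that these errors are independent across series. The function name `prob_any_premature` and the 5% rate are hypothetical. Under those assumptions, the chance of at least one such error on a given day is 1 - (1 - q)^n for n series.

```python
def prob_any_premature(per_series_error: float, n_series: int) -> float:
    """Chance that at least one of n series is wrongly believed released.

    Assumes independent errors with a common per-series rate. Both the
    independence and the common rate are illustrative assumptions, not
    results from the paper.
    """
    if not 0.0 <= per_series_error <= 1.0:
        raise ValueError("per_series_error must lie in [0, 1]")
    if n_series < 0:
        raise ValueError("n_series must be non-negative")
    # Complement of "every series is handled correctly".
    return 1.0 - (1.0 - per_series_error) ** n_series


if __name__ == "__main__":
    # Hypothetical 5% per-series error rate, for illustration only.
    for n in (1, 10, 50):
        print(n, round(prob_any_premature(0.05, n), 3))
```

Even a small per-series error rate compounds quickly as the number of tracked series grows, which is consistent with the abstract's observation that accurate recall of individual release dates does not guarantee a correct picture of what was available on a given day.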

Suggested Citation

  • Leland D. Crane & Akhil Karra & Paul E. Soto, 2025. "Total Recall? Evaluating the Macroeconomic Knowledge of Large Language Models," Finance and Economics Discussion Series 2025-044, Board of Governors of the Federal Reserve System (U.S.).
  • Handle: RePEc:fip:fedgfe:2025-44
    DOI: 10.17016/FEDS.2025.044

    Download full text from publisher

    File URL: https://www.federalreserve.gov/econres/feds/files/2025044pap.pdf
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Alejandro Lopez-Lira & Yuehua Tang & Mingyin Zhu, 2025. "The Memorization Problem: Can We Trust LLMs' Economic Forecasts?," Papers 2504.14765, arXiv.org.
    2. Dong, Mengming Michael & Stratopoulos, Theophanis C. & Wang, Victor Xiaoqi, 2024. "A scoping review of ChatGPT research in accounting and finance," International Journal of Accounting Information Systems, Elsevier, vol. 55(C).
    3. Yikai Zhao & Jun Nagayasu & Xinyi Geng, 2024. "Measuring Climate Policy Uncertainty with LLMs: New Insights into Corporate Bond Credit Spreads," DSSR Discussion Papers 143, Graduate School of Economics and Management, Tohoku University.
    4. Sugat Chaturvedi & Rochana Chaturvedi, 2025. "Who Gets the Callback? Generative AI and Gender Bias," Papers 2504.21400, arXiv.org.
    5. Alejandro Lopez-Lira, 2025. "Can Large Language Models Trade? Testing Financial Theories with LLM Agents in Market Simulations," Papers 2504.10789, arXiv.org.
    6. Jian-Qiao Zhu & Haijiang Yan & Thomas L. Griffiths, 2024. "Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice," Papers 2405.19313, arXiv.org, revised May 2025.
    7. Felipe A. Csaszar & Harsh Ketkar & Hyunjin Kim, 2024. "Artificial Intelligence and Strategic Decision-Making: Evidence from Entrepreneurs and Investors," Papers 2408.08811, arXiv.org.
    8. Can Celebi & Stefan Penczynski, 2024. "Using Large Language Models for Text Classification in Experimental Economics," Working Paper series, University of East Anglia, Centre for Behavioural and Experimental Social Science (CBESS) 24-01, School of Economics, University of East Anglia, Norwich, UK.
    9. Manish Jha & Jialin Qian & Michael Weber & Baozhong Yang, 2024. "Harnessing Generative AI for Economic Insights," Papers 2410.03897, arXiv.org, revised Feb 2025.
    10. Alex Kim & Maximilian Muhn & Valeri Nikolaev, 2024. "Financial Statement Analysis with Large Language Models," Papers 2407.17866, arXiv.org, revised Feb 2025.
    11. Julian Junyan Wang & Victor Xiaoqi Wang, 2025. "Assessing Consistency and Reproducibility in the Outputs of Large Language Models: Evidence Across Diverse Finance and Accounting Tasks," Papers 2503.16974, arXiv.org, revised Jun 2025.
    12. Albanesi, Stefania & Dias da Silva, Antonio & Jimeno, Juan Francisco & Lamo, Ana & Wabitsch, Alena, 2023. "New Technologies and Jobs in Europe," CEPR Discussion Papers 18220, C.E.P.R. Discussion Papers.
    13. Shuaiyu Chen & T. Clifton Green & Huseyin Gulen & Dexin Zhou, 2024. "What Does ChatGPT Make of Historical Stock Returns? Extrapolation and Miscalibration in LLM Stock Return Forecasts," Papers 2409.11540, arXiv.org.
    14. Breitung, Christian & Müller, Sebastian, 2025. "Global Business Networks," Journal of Financial Economics, Elsevier, vol. 166(C).
    15. Wagner Marco, 2024. "Künstliche Intelligenz: ChatGPT bei EZB-Prognosen," Wirtschaftsdienst, Sciendo, vol. 104(9), pages 592-592.
    16. Kim Shin Young & Sang-Gun Lee & Ga Youn Hong, 2024. "User satisfaction with the service quality of ChatGPT," Service Business, Springer;Pan-Pacific Business Association, vol. 18(3), pages 417-431, December.
    17. Zareh Asatryan & Carlo Birkholz & Friedrich Heinemann, 2025. "Evidence-based policy or beauty contest? An LLM-based meta-analysis of EU cohesion policy evaluations," International Tax and Public Finance, Springer;International Institute of Public Finance, vol. 32(2), pages 625-655, April.
    18. Garg, Prashant & Fetzer, Thiemo, 2024. "Causal Claims in Economics," OSF Preprints u4vgs, Center for Open Science.
    19. Buchanan, Joy & Hickman, William, 2024. "Do people trust humans more than ChatGPT?," Journal of Behavioral and Experimental Economics (formerly The Journal of Socio-Economics), Elsevier, vol. 112(C).
    20. Fetzer, Thiemo & Lambert, Peter John & Feld, Bennet & Garg, Prashant, 2024. "AI-Generated Production Networks : Measurement and Applications to Global Trade," The Warwick Economics Research Paper Series (TWERPS) 1528, University of Warwick, Department of Economics.

    More about this item

    JEL classification:

    • C53 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Forecasting and Prediction Models; Simulation Methods
    • C80 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - General
    • E37 - Macroeconomics and Monetary Economics - - Prices, Business Fluctuations, and Cycles - - - Forecasting and Simulation: Models and Applications
