ChatMacro: Evaluating Inflation Forecasts of Generative AI

Authors

  • M. Jahangir Alam
  • Shane Boyle
  • Huiyu Li
  • Tatevik Sekhposyan

Abstract

Recent research suggests that generic large language models (LLMs) can match the accuracy of traditional methods when forecasting macroeconomic variables in pseudo out-of-sample settings generated via prompts. This paper assesses the out-of-sample forecasting accuracy of LLMs by eliciting real-time forecasts of U.S. inflation from ChatGPT. We find that out-of-sample predictions are largely inaccurate and stale, even though forecasts generated in pseudo out-of-sample environments are comparable to existing benchmarks. Our results underscore the importance of out-of-sample benchmarking for LLM predictions.
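As a rough illustration of the kind of benchmarking the abstract refers to, the sketch below compares the root mean squared error (RMSE) of a set of forecasts against a benchmark. It is not the authors' method: the function names `rmse` and `relative_rmse` are hypothetical, and all numbers are made up for illustration rather than taken from the paper.

```python
import math


def rmse(forecasts, actuals):
    """Root mean squared error between paired forecasts and realized values."""
    if len(forecasts) != len(actuals) or not forecasts:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(f - a) ** 2 for f, a in zip(forecasts, actuals)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_rmse(model_forecasts, benchmark_forecasts, actuals):
    """Model RMSE divided by benchmark RMSE; a value below 1 favors the model."""
    return rmse(model_forecasts, actuals) / rmse(benchmark_forecasts, actuals)


# Made-up inflation rates (percent), for illustration only.
actuals = [3.0, 3.2, 2.9, 2.7]
# A hypothetical "stale" forecast that never updates.
llm_forecasts = [2.5, 2.5, 2.5, 2.5]
# A hypothetical benchmark forecast series.
benchmark_forecasts = [3.1, 3.0, 3.2, 2.9]

ratio = relative_rmse(llm_forecasts, benchmark_forecasts, actuals)
print(f"relative RMSE: {ratio:.2f}")
```

A ratio above 1 means the forecasts under evaluation were less accurate than the benchmark over the chosen sample. In a true out-of-sample exercise, each forecast would be recorded before the corresponding value is published.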

Suggested Citation

  • M. Jahangir Alam & Shane Boyle & Huiyu Li & Tatevik Sekhposyan, 2026. "ChatMacro: Evaluating Inflation Forecasts of Generative AI," Working Paper Series 2026-04, Federal Reserve Bank of San Francisco.
  • Handle: RePEc:fip:fedfwp:102407
    DOI: 10.24148/wp2026-04
    Note: PDF date: January 27, 2026.

    Download full text from publisher

    File URL: https://www.frbsf.org/wp-content/uploads/wp2026-04.pdf
    File Function: PDF - view
    Download Restriction: no

    File URL: https://www.frbsf.org/research-and-insights/publications/working-papers/2026/02/chatmacro-evaluating-inflation-forecasts-generative-of-ai/
    File Function: FRBSF - view
    Download Restriction: no


    References listed on IDEAS

    1. Zarifhonarvar, Ali, 2026. "Generating inflation expectations with large language models," Journal of Monetary Economics, Elsevier, vol. 157(C).
    2. Leland D. Crane & Akhil Karra & Paul E. Soto, 2025. "Total Recall? Evaluating the Macroeconomic Knowledge of Large Language Models," Finance and Economics Discussion Series 2025-044, Board of Governors of the Federal Reserve System (U.S.).
    3. Sophia Kazinnik & Tara M. Sinclair, 2025. "FOMC In Silico: A Multi-Agent System for Monetary Policy Decision Modeling," Working Papers 2025-005, The George Washington University, The Center for Economic Research.
    4. Songrun He & Linying Lv & Asaf Manela & Jimmy Wu, 2025. "Chronologically Consistent Large Language Models," Papers 2502.21206, arXiv.org, revised Jul 2025.
    5. Croushore, Dean, 2006. "Forecasting with Real-Time Macroeconomic Data," Handbook of Economic Forecasting, in: G. Elliott & C. Granger & A. Timmermann (ed.), Handbook of Economic Forecasting, edition 1, volume 1, chapter 17, pages 961-982, Elsevier.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Alexander Eliseev & Sergei Seleznev, 2026. "Fake Date Tests: Can We Trust In-sample Accuracy of LLMs in Macroeconomic Forecasting?," Papers 2601.07992, arXiv.org, revised Mar 2026.
    2. Alexander Eliseev & Sergei Seleznev, 2026. "Fake Date Tests: Can We Trust In-sample Accuracy of LLMs in Macroeconomic Forecasting?," Bank of Russia Working Paper Series wps167, Bank of Russia.
    3. Andrea Carriero & Todd E. Clark & Massimiliano Marcellino, 2015. "Realtime nowcasting with a Bayesian mixed frequency model with stochastic volatility," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 178(4), pages 837-862, October.
    4. Warne, Anders, 2023. "DSGE model forecasting: rational expectations vs. adaptive learning," Working Paper Series 2768, European Central Bank.
    5. S. Boragan Aruoba & Francis X. Diebold, 2010. "Real-Time Macroeconomic Monitoring: Real Activity, Inflation, and Interactions," American Economic Review, American Economic Association, vol. 100(2), pages 20-24, May.
    6. Sean Cao & Wei Jiang & Hui Xu, 2026. "Seeing the Goal, Missing the Truth: Human Accountability for AI Bias," Papers 2602.09504, arXiv.org.
    7. Clements, Michael P. & Beatriz Galvão, Ana, 2010. "First announcements and real economic activity," European Economic Review, Elsevier, vol. 54(6), pages 803-817, August.
    8. Knut Are Aastveit & Karsten R. Gerdrup & Anne Sofie Jore & Leif Anders Thorsrud, 2014. "Nowcasting GDP in Real Time: A Density Combination Approach," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 32(1), pages 48-68, January.
    9. Aastveit, Knut Are & Anundsen, André K. & Herstad, Eyo I., 2019. "Residential investment and recession predictability," International Journal of Forecasting, Elsevier, vol. 35(4), pages 1790-1799.
    10. Clements Michael P., 2012. "Forecasting U.S. Output Growth with Non-Linear Models in the Presence of Data Uncertainty," Studies in Nonlinear Dynamics & Econometrics, De Gruyter, vol. 16(1), pages 1-27, January.
    11. Andrea Carriero & Todd E. Clark & Massimiliano Marcellino, 2016. "Common Drifting Volatility in Large Bayesian VARs," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(3), pages 375-390, July.
    12. Garnitz, Johanna & Lehmann, Robert & Wohlrabe, Klaus, 2019. "Forecasting GDP all over the world using leading indicators based on comprehensive survey data," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 51(54), pages 5802-5816.
    13. Nikoleta Anesti & Ana Beatriz Galvao & Silvia Miranda-Agrippino, 2018. "Uncertain Kingdom: Nowcasting GDP and its Revisions," Discussion Papers 1824, Centre for Macroeconomics (CFM).
    14. Stefan Neuwirth, 2017. "Time-varying mixed frequency forecasting: A real-time experiment," KOF Working papers 17-430, KOF Swiss Economic Institute, ETH Zurich.
    15. Davide Delle Monache & Andrea De Polis & Ivan Petrella, 2024. "Modeling and Forecasting Macroeconomic Downside Risk," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 42(3), pages 1010-1025, July.
    16. Barbara Rossi, 2021. "Forecasting in the Presence of Instabilities: How We Know Whether Models Predict Well and How to Improve Them," Journal of Economic Literature, American Economic Association, vol. 59(4), pages 1135-1190, December.
    17. Yutong Yan & Raphael Tang & Zhenyu Gao & Wenxi Jiang & Yao Lu, 2026. "DatedGPT: Preventing Lookahead Bias in Large Language Models with Time-Aware Pretraining," Papers 2603.11838, arXiv.org.
    18. Allen Yikuan Huang & Zheqi Fan, 2026. "Beyond Prompting: An Autonomous Framework for Systematic Factor Investing via Agentic AI," Papers 2603.14288, arXiv.org, revised Apr 2026.
    19. Michael Pfarrhofer, 2024. "Forecasts with Bayesian vector autoregressions under real time conditions," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 43(3), pages 771-801, April.
    20. Massimiliano Marcellino, 2008. "A linear benchmark for forecasting GDP growth and inflation?," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 27(4), pages 305-340.

    More about this item

    JEL classification:

    • C45 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Neural Networks and Related Topics
    • E31 - Macroeconomics and Monetary Economics - - Prices, Business Fluctuations, and Cycles - - - Price Level; Inflation; Deflation
    • E37 - Macroeconomics and Monetary Economics - - Prices, Business Fluctuations, and Cycles - - - Forecasting and Simulation: Models and Applications
