Printed from https://ideas.repec.org/p/arx/papers/2601.07992.html

Fake Date Tests: Can We Trust In-sample Accuracy of LLMs in Macroeconomic Forecasting?

Authors

  • Alexander Eliseev
  • Sergei Seleznev

Abstract

Large language models (LLMs) are machine learning tools that economists have begun to apply in empirical research. One such application is macroeconomic forecasting, in which LLMs are backtested even though they were trained on the same data that is used to estimate their forecasting performance. Can these in-sample accuracy results be extrapolated to a model's out-of-sample performance? To answer this question, we developed a family of prompt sensitivity tests, including two members of this family that we call the fake date tests. These tests aim to detect two types of bias in LLMs' in-sample forecasts: lookahead bias and context bias. In our empirical results, none of the modern LLMs tested in this study passed the first test, which signals the presence of lookahead bias in their in-sample forecasts.
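The abstract does not spell out how the tests are carried out, so the following Python sketch is only an illustration of the general idea of a prompt sensitivity check and should not be read as the authors' procedure. It assumes a hypothetical forecaster that receives a date label and a data history, and it asks whether the forecast error changes when only the date label is replaced by a fake one. The names `date_sensitivity`, `sign_flip_pvalue`, `leaky_forecast`, `clean_forecast`, the toy numbers, and the choice of a sign-flip randomization test are all assumptions made for this example.

```python
import random
from statistics import mean

# Toy "training memory": values a model might have memorized, keyed by date.
# All numbers are made up for illustration.
MEMORY = {
    "2019-Q1": 2.1, "2019-Q2": 1.8, "2019-Q3": 1.7, "2019-Q4": 2.0,
    "2020-Q1": 2.3, "2020-Q2": 0.4, "2020-Q3": 1.2, "2020-Q4": 1.3,
}

# Each case is (true_date, history, actual); history ends one period earlier.
_dates = list(MEMORY)
CASES = [
    (d, [1.9] + [MEMORY[p] for p in _dates[:i]], MEMORY[d])
    for i, d in enumerate(_dates)
]


def clean_forecast(date_label, history):
    """Hypothetical forecaster that uses only the data: a naive last-value rule."""
    return history[-1]


def leaky_forecast(date_label, history):
    """Hypothetical forecaster that recalls the outcome when it recognizes the date."""
    return MEMORY.get(date_label, history[-1])


def date_sensitivity(forecast, cases, fake_date):
    """Per-case change in absolute error when the true date is replaced by fake_date."""
    diffs = []
    for true_date, history, actual in cases:
        err_true = abs(forecast(true_date, history) - actual)
        err_fake = abs(forecast(fake_date, history) - actual)
        diffs.append(err_fake - err_true)
    return diffs


def sign_flip_pvalue(diffs, n_draws=10_000, seed=0):
    """Two-sided sign-flip randomization p-value for a zero mean difference."""
    rng = random.Random(seed)
    observed = abs(mean(diffs))
    hits = 0
    for _ in range(n_draws):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(mean(flipped)) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_draws + 1)


if __name__ == "__main__":
    for name, model in [("clean", clean_forecast), ("leaky", leaky_forecast)]:
        diffs = date_sensitivity(model, CASES, fake_date="2031-Q1")
        print(name, "p-value:", round(sign_flip_pvalue(diffs), 4))
```

In this toy setup the data-only forecaster is unaffected by the date label, while the forecaster that recalls outcomes by date loses accuracy once the label is faked. With a real LLM, `forecast` would wrap a model call, and the design of the fake dates and the test statistic would need to follow the paper itself.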

Suggested Citation

  • Alexander Eliseev & Sergei Seleznev, 2026. "Fake Date Tests: Can We Trust In-sample Accuracy of LLMs in Macroeconomic Forecasting?," Papers 2601.07992, arXiv.org, revised Mar 2026.
  • Handle: RePEc:arx:papers:2601.07992

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2601.07992
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Alexander Eliseev & Sergei Seleznev, 2026. "Fake Date Tests: Can We Trust In-sample Accuracy of LLMs in Macroeconomic Forecasting?," Bank of Russia Working Paper Series wps167, Bank of Russia.
    2. Nikoleta Anesti & Edward Hill & Andreas Joseph, 2025. "Inflation Attitudes of Large Language Models," Papers 2512.14306, arXiv.org.
    3. Didisheim, Antoine & Fraschini, Martina & Somoza, Luciano, 2025. "AI’s predictable memory in financial analysis," Economics Letters, Elsevier, vol. 256(C).
    4. Leland D. Crane & Akhil Karra & Paul E. Soto, 2025. "Total Recall? Evaluating the Macroeconomic Knowledge of Large Language Models," Finance and Economics Discussion Series 2025-044, Board of Governors of the Federal Reserve System (U.S.).
    5. Ke Wu & Baozhong Yang & Zhenkun Ying & Dexin Zhou, 2025. "Anonymization and Information Loss," Papers 2511.15364, arXiv.org.
    6. M.Jahangir Alam & Shane Boyle & Huiyu Li & Tatevik Sekhposyan, 2026. "ChatMacro: Evaluating Inflation Forecasts of Generative AI," Working Paper Series 2026-04, Federal Reserve Bank of San Francisco.
    7. repec:rim:rimwps:18-20 is not listed on IDEAS
    8. Mehmet Caner & Agostino Capponi & Nathan Sun & Jonathan Y. Tan, 2026. "Designing Agentic AI-Based Screening for Portfolio Investment," Papers 2603.23300, arXiv.org.
    9. Zhenyu Gao & Wenxi Jiang & Yutong Yan, 2025. "A Test of Lookahead Bias in LLM Forecasts," Papers 2512.23847, arXiv.org.
    10. Mostapha Benhenda, 2026. "Look-Ahead-Bench: a Standardized Benchmark of Look-ahead Bias in Point-in-Time LLMs for Finance," Papers 2601.13770, arXiv.org.
    11. Songrun He & Linying Lv & Asaf Manela & Jimmy Wu, 2025. "Instruction Tuning Chronologically Consistent Language Models," Papers 2510.11677, arXiv.org, revised Nov 2025.
    12. Minniti, Antonio & Prettner, Klaus & Venturini, Francesco, 2025. "AI innovation and the labor share in European regions," European Economic Review, Elsevier, vol. 177(C).
    13. Stefania Albanesi & António Dias da Silva & Juan F Jimeno & Ana Lamo & Alena Wabitsch, 2025. "New technologies and jobs in Europe," Economic Policy, CEPR, CESifo, Sciences Po;CES;MSH, vol. 40(121), pages 71-139.
    14. Feyzollahi, Maryam & Rafizadeh, Nima, 2025. "The adoption of Large Language Models in economics research," Economics Letters, Elsevier, vol. 250(C).
    15. Giuseppe Matera, 2025. "Corporate Earnings Calls and Analyst Beliefs," Papers 2511.15214, arXiv.org, revised Nov 2025.
    16. Carriero, Andrea & Galvão, Ana Beatriz & Kapetanios, George, 2019. "A comprehensive evaluation of macroeconomic forecasting methods," International Journal of Forecasting, Elsevier, vol. 35(4), pages 1226-1239.
    17. Florian Misch & Ben Park & Carlo Pizzinelli & Galen Sher, 2026. "Artificial Intelligence and Productivity in Europe," CESifo Working Paper Series 12401, CESifo.
    18. Dumont, Michel & Rayp, Glenn, 2025. "Belgian start-ups in Artificial Intelligence," MPRA Paper 126994, University Library of Munich, Germany.
    19. Yan Liu & He Wang, 2024. "Who on Earth Is Using Generative AI?," Policy Research Working Paper Series 10870, The World Bank.
    20. Liu, Yan & Wang, He, 2026. "Who on earth is using Generative AI?," World Development, Elsevier, vol. 199(C).
    21. Gary Koop & Dimitris Korobilis, 2019. "Forecasting with High‐Dimensional Panel VARs," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 81(5), pages 937-959, October.
