Printed from https://ideas.repec.org/p/bkr/wpaper/wps167.html

Fake Date Tests: Can We Trust In-sample Accuracy of LLMs in Macroeconomic Forecasting?

Authors

  • Alexander Eliseev

    (Bank of Russia, Russian Federation)

  • Sergei Seleznev

    (Bank of Russia, Russian Federation)

Abstract

Large language models (LLMs) are a type of machine learning tool that economists have started to apply in their empirical research. One such application is macroeconomic forecasting with backtesting of LLMs, even though they are trained on the same data that is used to estimate their forecasting performance. Can these in-sample accuracy results be extrapolated to the model’s out-of-sample performance? To answer this question, we developed a family of prompt sensitivity tests and two members of this family, which we call the fake date tests. These tests aim to detect two types of biases in LLMs’ in-sample forecasts: lookahead bias and context bias. According to the empirical results, none of the modern LLMs tested in this study passed our tests, signaling the presence of biases in their in-sample forecasts.
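
The abstract does not spell out the test statistic, so the following is only a minimal sketch of what a generic prompt sensitivity check could look like, not the authors' procedure. It assumes the same forecast targets are queried twice, once with a baseline prompt and once with a perturbed prompt (for example, one stating a different current date), and asks whether the forecasts shift systematically. The function name `paired_sensitivity_stat` and the paired t-statistic are illustrative assumptions.

```python
import math
import statistics


def paired_sensitivity_stat(baseline, perturbed):
    """Return (mean difference, paired t-statistic) for two forecast lists.

    Hypothetical helper: `baseline` and `perturbed` hold forecasts of the
    same targets under two prompt variants. This is an assumed, generic
    paired comparison, not the test defined in the paper.
    """
    if len(baseline) != len(perturbed):
        raise ValueError("forecast lists must have the same length")
    if len(baseline) < 2:
        raise ValueError("need at least two paired forecasts")

    # Difference induced by the prompt perturbation, target by target.
    diffs = [p - b for b, p in zip(baseline, perturbed)]
    mean_diff = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation

    if sd == 0:
        # Every difference is identical: no shift gives 0, a constant
        # nonzero shift gives an infinite statistic with the shift's sign.
        t_stat = 0.0 if mean_diff == 0 else math.copysign(math.inf, mean_diff)
    else:
        t_stat = mean_diff / (sd / math.sqrt(len(diffs)))
    return mean_diff, t_stat
```

Under these assumptions, a statistic far from zero would indicate that the forecasts respond to the prompt change. How such a response maps to lookahead bias or context bias depends on the design described in the full paper.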

Suggested Citation

  • Alexander Eliseev & Sergei Seleznev, 2026. "Fake Date Tests: Can We Trust In-sample Accuracy of LLMs in Macroeconomic Forecasting?," Bank of Russia Working Paper Series wps167, Bank of Russia.
  • Handle: RePEc:bkr:wpaper:wps167

    Download full text from publisher

    File URL: http://www.cbr.ru/StaticHtml/File/188046/wp_167.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Daron Acemoglu, 2025. "The simple macroeconomics of AI," Economic Policy, CEPR, CESifo, Sciences Po;CES;MSH, vol. 40(121), pages 13-58.
    2. Diebold, Francis X. & Schorfheide, Frank & Shin, Minchul, 2017. "Real-time forecast evaluation of DSGE models with stochastic volatility," Journal of Econometrics, Elsevier, vol. 201(2), pages 322-332.
    3. Anton Korinek, 2025. "AI Agents for Economic Research," NBER Working Papers 34202, National Bureau of Economic Research, Inc.
    4. Jens Ludwig & Sendhil Mullainathan & Ashesh Rambachan, 2024. "Large Language Models: An Applied Econometric Framework," Papers 2412.07031, arXiv.org, revised Dec 2025.
    5. Anton Korinek, 2023. "Generative AI for Economic Research: Use Cases and Implications for Economists," Journal of Economic Literature, American Economic Association, vol. 61(4), pages 1281-1317, December.
    6. Alejandro Lopez-Lira & Yuehua Tang, 2023. "Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models," Papers 2304.07619, arXiv.org, revised Oct 2025.
    7. Sophia Kazinnik & Tara M. Sinclair, 2025. "FOMC In Silico: A Multi-Agent System for Monetary Policy Decision Modeling," Working Papers 2025-005, The George Washington University, The Center for Economic Research.
    8. Songrun He & Linying Lv & Asaf Manela & Jimmy Wu, 2025. "Chronologically Consistent Large Language Models," Papers 2502.21206, arXiv.org, revised Jul 2025.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Alexander Eliseev & Sergei Seleznev, 2026. "Fake Date Tests: Can We Trust In-sample Accuracy of LLMs in Macroeconomic Forecasting?," Papers 2601.07992, arXiv.org, revised Mar 2026.
    2. Nikoleta Anesti & Edward Hill & Andreas Joseph, 2025. "Inflation Attitudes of Large Language Models," Papers 2512.14306, arXiv.org.
    3. Iñaki Aldasoro & Ajit Desai, 2025. "Money Talks: AI Agents for Cash Management in Payment Systems," Staff Working Papers 25-35, Bank of Canada.
    4. Iñaki Aldasoro & Ajit Desai, 2025. "AI agents for cash management in payment systems," BIS Working Papers 1310, Bank for International Settlements.
    5. Manish Jha & Jialin Qian & Michael Weber & Baozhong Yang, 2024. "Generative AI, Managerial Expectations, and Economic Activity," Papers 2410.03897, arXiv.org, revised Nov 2025.
    6. Songrun He & Linying Lv & Asaf Manela & Jimmy Wu, 2025. "Instruction Tuning Chronologically Consistent Language Models," Papers 2510.11677, arXiv.org, revised Nov 2025.
    7. Michael Bauer & Daniel Huber & Eric Offner & Marlene Renkel & Ole Wilms & Michael D. Bauer, 2024. "Corporate Green Pledges," CESifo Working Paper Series 11507, CESifo.
    8. Didisheim, Antoine & Fraschini, Martina & Somoza, Luciano, 2025. "AI’s predictable memory in financial analysis," Economics Letters, Elsevier, vol. 256(C).
    9. Minniti, Antonio & Prettner, Klaus & Venturini, Francesco, 2025. "AI innovation and the labor share in European regions," European Economic Review, Elsevier, vol. 177(C).
    10. James Bono & Beibei Cheng & Joaquin Lozano, 2025. "Randomized Controlled Trials for Conditional Access Optimization Agent," Papers 2511.13865, arXiv.org.
    11. Stefania Albanesi & António Dias da Silva & Juan F Jimeno & Ana Lamo & Alena Wabitsch, 2025. "New technologies and jobs in Europe," Economic Policy, CEPR, CESifo, Sciences Po;CES;MSH, vol. 40(121), pages 71-139.
    12. Feyzollahi, Maryam & Rafizadeh, Nima, 2025. "The adoption of Large Language Models in economics research," Economics Letters, Elsevier, vol. 250(C).
    13. Giuseppe Matera, 2025. "Corporate Earnings Calls and Analyst Beliefs," Papers 2511.15214, arXiv.org, revised Nov 2025.
    14. Florian Misch & Ben Park & Carlo Pizzinelli & Galen Sher, 2026. "Artificial Intelligence and Productivity in Europe," CESifo Working Paper Series 12401, CESifo.
    15. Yan Liu & He Wang, 2024. "Who on Earth Is Using Generative AI ?," Policy Research Working Paper Series 10870, The World Bank.
    16. Herbert Dawid & Philipp Harting & Hankui Wang & Zhongli Wang & Jiachen Yi, 2025. "Agentic Workflows for Economic Research: Design and Implementation," Papers 2504.09736, arXiv.org.
    17. Liu, Yan & Wang, He, 2026. "Who on earth is using Generative AI?," World Development, Elsevier, vol. 199(C).
    18. James Bono, 2025. "Randomized Controlled Trials for Phishing Triage Agent," Papers 2511.13860, arXiv.org.
    19. Ke Wu & Baozhong Yang & Zhenkun Ying & Dexin Zhou, 2025. "Anonymization and Information Loss," Papers 2511.15364, arXiv.org.
    20. M.Jahangir Alam & Shane Boyle & Huiyu Li & Tatevik Sekhposyan, 2026. "ChatMacro: Evaluating Inflation Forecasts of Generative AI," Working Paper Series 2026-04, Federal Reserve Bank of San Francisco.

    More about this item

    JEL classification:

    • C12 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Hypothesis Testing: General
    • C52 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Model Evaluation, Validation, and Selection
    • C53 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Forecasting and Prediction Models; Simulation Methods
