
Instruction Tuning Chronologically Consistent Language Models

Author

Listed:
  • Songrun He
  • Linying Lv
  • Asaf Manela
  • Jimmy Wu

Abstract

We introduce a family of chronologically consistent, instruction-tuned large language models to eliminate lookahead bias. Each model is trained only on data available before a clearly defined knowledge-cutoff date, ensuring strict temporal separation from any post-cutoff data. The resulting framework offers (i) a simple, conversational chat interface, (ii) fully open, fixed model weights that guarantee replicability, and (iii) a conservative lower bound on forecast accuracy, isolating the share of predictability that survives once training leakage is removed. Together, these features give researchers an easy-to-use generative AI tool, free of lookahead bias, for a wide range of prediction tasks.
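
For intuition, the cutoff mechanism in the abstract can be sketched in a few lines of Python. This is a minimal, hypothetical illustration of the chronological-consistency idea, not the authors' implementation; the corpus layout, dates, and function name are assumptions made for the example.

    from datetime import date

    # Hypothetical corpus of (publication_date, text) pairs; contents are illustrative.
    corpus = [
        (date(2007, 5, 1), "Fed leaves rates unchanged..."),
        (date(2009, 3, 15), "Markets rally on stimulus news..."),
        (date(2021, 11, 2), "Post-cutoff article that must be excluded..."),
    ]

    def chronologically_consistent(corpus, cutoff):
        # Keep only documents published strictly before the knowledge-cutoff
        # date, enforcing the strict temporal separation described above.
        return [text for pub_date, text in corpus if pub_date < cutoff]

    # A model trained with a 2010-01-01 cutoff never sees post-cutoff text, so any
    # predictability it shows for later periods cannot come from training leakage.
    training_texts = chronologically_consistent(corpus, cutoff=date(2010, 1, 1))
    print(len(training_texts))  # -> 2

Under this framing, evaluating such a model only on post-cutoff outcomes is what yields the conservative lower bound on forecast accuracy described in point (iii).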

Suggested Citation

  • Songrun He & Linying Lv & Asaf Manela & Jimmy Wu, 2025. "Instruction Tuning Chronologically Consistent Language Models," Papers 2510.11677, arXiv.org, revised Nov 2025.
  • Handle: RePEc:arx:papers:2510.11677

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2510.11677
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Jens Ludwig & Sendhil Mullainathan & Ashesh Rambachan, 2024. "Large Language Models: An Applied Econometric Framework," Papers 2412.07031, arXiv.org, revised Jan 2025.
    2. Paul Glasserman & Caden Lin, 2023. "Assessing Look-Ahead Bias in Stock Return Predictions Generated By GPT Sentiment Analysis," Papers 2309.17322, arXiv.org.
    3. Songrun He & Linying Lv & Asaf Manela & Jimmy Wu, 2025. "Chronologically Consistent Large Language Models," Papers 2502.21206, arXiv.org, revised Jul 2025.
    4. Manish Jha & Jialin Qian & Michael Weber & Baozhong Yang, 2024. "ChatGPT and Corporate Policies," NBER Working Papers 32161, National Bureau of Economic Research, Inc.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hui Chen & Antoine Didisheim & Luciano Somoza & Hanqing Tian, 2025. "A Financial Brain Scan of the LLM," Papers 2508.21285, arXiv.org.
    2. Alejandro Lopez-Lira & Yuehua Tang & Mingyin Zhu, 2025. "The Memorization Problem: Can We Trust LLMs' Economic Forecasts?," Papers 2504.14765, arXiv.org.
    3. Songrun He & Linying Lv & Asaf Manela & Jimmy Wu, 2025. "Chronologically Consistent Large Language Models," Papers 2502.21206, arXiv.org, revised Jul 2025.
    4. Liyuan Chen & Shuoling Liu & Jiangpeng Yan & Xiaoyu Wang & Henglin Liu & Chuang Li & Kecheng Jiao & Jixuan Ying & Yang Veronica Liu & Qiang Yang & Xiu Li, 2025. "Advancing Financial Engineering with Foundation Models: Progress, Applications, and Challenges," Papers 2507.18577, arXiv.org.
    5. Can Celebi & Stefan Penczynski, 2024. "Using Large Language Models for Text Classification in Experimental Economics," Working Paper series 24-01, University of East Anglia, Centre for Behavioural and Experimental Social Science (CBESS), School of Economics, University of East Anglia, Norwich, UK.
    6. Alex Kim & Maximilian Muhn & Valeri Nikolaev, 2024. "Financial Statement Analysis with Large Language Models," Papers 2407.17866, arXiv.org, revised Feb 2025.
    7. Julian Junyan Wang & Victor Xiaoqi Wang, 2025. "Assessing Consistency and Reproducibility in the Outputs of Large Language Models: Evidence Across Diverse Finance and Accounting Tasks," Papers 2503.16974, arXiv.org, revised Sep 2025.
    8. Shuaiyu Chen & T. Clifton Green & Huseyin Gulen & Dexin Zhou, 2024. "What Does ChatGPT Make of Historical Stock Returns? Extrapolation and Miscalibration in LLM Stock Return Forecasts," Papers 2409.11540, arXiv.org.
    9. Breitung, Christian & Müller, Sebastian, 2025. "Global Business Networks," Journal of Financial Economics, Elsevier, vol. 166(C).
    10. Wu, Qinqin & Zhuang, Qinqin & Liu, Yitong & Han, Longyan, 2024. "Technology shock of ChatGPT, social attention and firm value: Evidence from China," Technology in Society, Elsevier, vol. 79(C).
    11. Dong, Mengming Michael & Stratopoulos, Theophanis C. & Wang, Victor Xiaoqi, 2024. "A scoping review of ChatGPT research in accounting and finance," International Journal of Accounting Information Systems, Elsevier, vol. 55(C).
    12. Feyzollahi, Maryam & Rafizadeh, Nima, 2025. "The adoption of Large Language Models in economics research," Economics Letters, Elsevier, vol. 250(C).
    13. Leland D. Crane & Akhil Karra & Paul E. Soto, 2025. "Total Recall? Evaluating the Macroeconomic Knowledge of Large Language Models," Finance and Economics Discussion Series 2025-044, Board of Governors of the Federal Reserve System (U.S.).
    14. Yan Liu & He Wang, 2024. "Who on Earth Is Using Generative AI?," Policy Research Working Paper Series 10870, The World Bank.
    15. Shimamura, Takuya & Tanaka, Yoshitaka & Managi, Shunsuke, 2025. "Evaluating the impact of report readability on ESG scores: A generative AI approach," International Review of Financial Analysis, Elsevier, vol. 101(C).
    16. Herbert Dawid & Philipp Harting & Hankui Wang & Zhongli Wang & Jiachen Yi, 2025. "Agentic Workflows for Economic Research: Design and Implementation," Papers 2504.09736, arXiv.org.
    17. Philippe Goulet Coulombe, 2025. "Ordinary Least Squares as an Attention Mechanism," Papers 2504.09663, arXiv.org.
    18. Yikai Zhao & Jun Nagayasu & Xinyi Geng, 2024. "Measuring Climate Policy Uncertainty with LLMs: New Insights into Corporate Bond Credit Spreads," DSSR Discussion Papers 143, Graduate School of Economics and Management, Tohoku University.
    19. Wo Long & Wenxin Zeng & Xiaoyu Zhang & Ziyao Zhou, 2025. "Integrating Large Language Models and Reinforcement Learning for Sentiment-Driven Quantitative Trading," Papers 2510.10526, arXiv.org.


    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:2510.11677. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: arXiv administrators (email available below). General contact details of provider: http://arxiv.org/.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.