Printed from https://ideas.repec.org/p/arx/papers/2309.03079.html

GPT-InvestAR: Enhancing Stock Investment Strategies through Annual Report Analysis with Large Language Models

Author

Listed:
  • Udit Gupta

Abstract

Annual reports of publicly listed companies contain vital information about their financial health, which can help assess the potential impact on the firm's stock price. These reports are comprehensive, running up to, and sometimes exceeding, 100 pages. Analysing them is cumbersome even for a single firm, let alone for the whole universe of listed firms. Over the years, financial experts have become proficient at extracting valuable information from these documents relatively quickly, but this requires years of practice and experience. This paper aims to simplify the assessment of annual reports across all firms by leveraging the capabilities of Large Language Models (LLMs). The insights generated by the LLM are compiled into a quant-style dataset and augmented with historical stock price data. A machine learning model is then trained with the LLM outputs as features. The walk-forward test results show promising outperformance relative to S&P 500 returns. This paper intends to provide a framework for future work in this direction. To facilitate this, the code has been released as open source.
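
The walk-forward evaluation mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's released code: the row keys (`year`, `ticker`, `llm_score`, `fwd_return`) are hypothetical, and a single-feature least-squares fit stands in for the paper's machine learning model, which the abstract does not specify. The only point shown is that each test period is scored by a model fitted on earlier periods alone.

```python
from statistics import mean


def walk_forward_splits(periods, min_train=2):
    """Yield (train_periods, test_period) pairs in chronological order.

    Each test period is paired only with the periods that precede it.
    """
    ordered = sorted(set(periods))
    for i in range(min_train, len(ordered)):
        yield ordered[:i], ordered[i]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with one feature."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = cov / var if var else 0.0  # flat line if the feature never varies
    return my - b * mx, b


def walk_forward_top_k(rows, k=1, min_train=2):
    """Return {test_year: mean realised return of the k top-ranked tickers}.

    rows: list of dicts with the hypothetical keys
    'year', 'ticker', 'llm_score', 'fwd_return'.
    """
    results = {}
    years = [r["year"] for r in rows]
    for train_years, test_year in walk_forward_splits(years, min_train):
        train = [r for r in rows if r["year"] in train_years]
        test = [r for r in rows if r["year"] == test_year]
        # Fit on past data only, then rank the unseen year by predicted return.
        a, b = fit_line([r["llm_score"] for r in train],
                        [r["fwd_return"] for r in train])
        ranked = sorted(test, key=lambda r: a + b * r["llm_score"], reverse=True)
        results[test_year] = mean(r["fwd_return"] for r in ranked[:k])
    return results
```

Comparing the per-year values returned by `walk_forward_top_k` with a benchmark return for the same years would give the kind of relative-performance figure the abstract refers to; any such comparison depends entirely on the data supplied.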

Suggested Citation

  • Udit Gupta, 2023. "GPT-InvestAR: Enhancing Stock Investment Strategies through Annual Report Analysis with Large Language Models," Papers 2309.03079, arXiv.org.
  • Handle: RePEc:arx:papers:2309.03079

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2309.03079
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Alejandro Lopez-Lira & Yuehua Tang, 2023. "Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models," Papers 2304.07619, arXiv.org, revised Sep 2024.

    Citations



    Cited by:

    1. Dong, Mengming Michael & Stratopoulos, Theophanis C. & Wang, Victor Xiaoqi, 2024. "A scoping review of ChatGPT research in accounting and finance," International Journal of Accounting Information Systems, Elsevier, vol. 55(C).
    2. Joel R. Bock, 2024. "Generating long-horizon stock "buy" signals with a neural language model," Papers 2410.18988, arXiv.org.
    3. Deborah Miori & Constantin Petrov, 2023. "Narratives from GPT-derived Networks of News, and a link to Financial Markets Dislocations," Papers 2311.14419, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Manish Jha & Jialin Qian & Michael Weber & Baozhong Yang, 2024. "Harnessing Generative AI for Economic Insights," Papers 2410.03897, arXiv.org, revised Feb 2025.
    2. Alex Kim & Maximilian Muhn & Valeri Nikolaev, 2024. "Financial Statement Analysis with Large Language Models," Papers 2407.17866, arXiv.org, revised Feb 2025.
    3. Julian Junyan Wang & Victor Xiaoqi Wang, 2025. "Assessing Consistency and Reproducibility in the Outputs of Large Language Models: Evidence Across Diverse Finance and Accounting Tasks," Papers 2503.16974, arXiv.org, revised Mar 2025.
    4. Georgios Fatouros & Konstantinos Metaxas & John Soldatos & Dimosthenis Kyriazis, 2024. "Can Large Language Models Beat Wall Street? Unveiling the Potential of AI in Stock Selection," Papers 2401.03737, arXiv.org, revised Apr 2024.
    5. Marius Hofert, 2023. "Correlation Pitfalls with ChatGPT: Would You Fall for Them?," Risks, MDPI, vol. 11(7), pages 1-17, June.
    6. Marra de Artiñano, Ignacio & Riottini Depetris, Franco & Volpe Martincus, Christian, 2023. "Automatic Product Classification in International Trade: Machine Learning and Large Language Models," IDB Publications (Working Papers) 12962, Inter-American Development Bank.
    7. Yuqi Nie & Yaxuan Kong & Xiaowen Dong & John M. Mulvey & H. Vincent Poor & Qingsong Wen & Stefan Zohren, 2024. "A Survey of Large Language Models for Financial Applications: Progress, Prospects and Challenges," Papers 2406.11903, arXiv.org.
    8. Baptiste Lefort & Eric Benhamou & Jean-Jacques Ohana & David Saltiel & Beatrice Guez, 2024. "Optimizing Performance: How Compact Models Match or Exceed GPT's Classification Capabilities through Fine-Tuning," Papers 2409.11408, arXiv.org.
    9. Yujie Ding & Shuai Jia & Tianyi Ma & Bingcheng Mao & Xiuze Zhou & Liuliu Li & Dongming Han, 2023. "Integrating Stock Features and Global Information via Large Language Models for Enhanced Stock Return Prediction," Papers 2310.05627, arXiv.org.
    10. Manish Jha & Jialin Qian & Michael Weber & Baozhong Yang, 2024. "ChatGPT and Corporate Policies," NBER Working Papers 32161, National Bureau of Economic Research, Inc.
    11. Bauer, Michael & Huber, Daniel & Offner, Eric & Renkel, Marlene & Wilms, Ole, 2024. "Corporate green pledges," IMFS Working Paper Series 214, Goethe University Frankfurt, Institute for Monetary and Financial Stability (IMFS).
    12. Edward Li & Zhiyuan Tu & Dexin Zhou, 2024. "The Promise and Peril of Generative AI: Evidence from GPT-4 as Sell-Side Analysts," Papers 2412.01069, arXiv.org.
    13. Van Pham & Scott Cunningham, 2024. "Can Base ChatGPT be Used for Forecasting without Additional Optimization?," Papers 2404.07396, arXiv.org, revised Jul 2024.
    14. Junwei Su & Shan Wu & Jinhui Li, 2024. "MTRGL:Effective Temporal Correlation Discerning through Multi-modal Temporal Relational Graph Learning," Papers 2401.14199, arXiv.org, revised Feb 2024.
    15. Francisco Peñaranda & Enrique Sentana, 2024. "Portfolio management with big data," Working Papers wp2024_2411, CEMFI.
    16. Liping Wang & Jiawei Li & Lifan Zhao & Zhizhuo Kou & Xiaohan Wang & Xinyi Zhu & Hao Wang & Yanyan Shen & Lei Chen, 2023. "Methods for Acquiring and Incorporating Knowledge into Stock Price Prediction: A Survey," Papers 2308.04947, arXiv.org.
    17. Alex Kim & Maximilian Muhn & Valeri Nikolaev, 2023. "From Transcripts to Insights: Uncovering Corporate Risks Using Generative AI," Papers 2310.17721, arXiv.org, revised Mar 2025.
    18. Baptiste Lefort & Eric Benhamou & Jean-Jacques Ohana & David Saltiel & Beatrice Guez & Damien Challet, 2024. "Can ChatGPT Compute Trustworthy Sentiment Scores from Bloomberg Market Wraps?," Papers 2401.05447, arXiv.org.
    19. Jaskaran Singh Walia & Aarush Sinha & Srinitish Srinivasan & Srihari Unnikrishnan, 2025. "Predicting Liquidity-Aware Bond Yields using Causal GANs and Deep Reinforcement Learning with LLM Evaluation," Papers 2502.17011, arXiv.org.
    20. Hanshuang Tong & Jun Li & Ning Wu & Ming Gong & Dongmei Zhang & Qi Zhang, 2024. "Ploutos: Towards interpretable stock movement prediction with financial large language model," Papers 2403.00782, arXiv.org.
