
Structured Event Representation and Stock Return Predictability

Authors

  • Gang Li
  • Dandan Qiao
  • Mingxuan Zheng

Abstract

We find that event features extracted by large language models (LLMs) are effective for text-based stock return prediction. Using a pre-trained LLM to extract event features from news articles, we propose a novel deep learning model based on structured event representation (SER) and attention mechanisms to predict stock returns in the cross-section. Our SER-based model outperforms existing text-driven models in forecasting stock returns out of sample and offers highly interpretable feature structures for examining the mechanisms underlying stock return predictability. We further discuss various implications of SER and highlight the crucial benefit of structured model inputs for stock return predictability.

Suggested Citation

  • Gang Li & Dandan Qiao & Mingxuan Zheng, 2025. "Structured Event Representation and Stock Return Predictability," Papers 2512.19484, arXiv.org.
  • Handle: RePEc:arx:papers:2512.19484

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2512.19484
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Chen, Zilin & Guo, Li & Tu, Jun, 2021. "Media connection and return comovement," Journal of Economic Dynamics and Control, Elsevier, vol. 130(C).
    2. Jian Chen & Guohao Tang & Guofu Zhou & Wu Zhu, 2025. "ChatGPT and Deepseek: Can They Predict the Stock Market and Macroeconomy?," Papers 2502.10008, arXiv.org.
    3. Karsten Müller, 2022. "German forecasters’ narratives: How informative are German business cycle forecast reports?," Empirical Economics, Springer, vol. 62(5), pages 2373-2415, May.
    4. Jacobs, Heiko & Lauber, Alexander & Müller, Sebastian, 2025. "Bearish bets and the press: On the relation between short interest and media tone," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 104(C).
    5. García, Diego & Hu, Xiaowen & Rohrer, Maximilian, 2023. "The colour of finance words," Journal of Financial Economics, Elsevier, vol. 147(3), pages 525-549.
    6. Su, Zhi & Lu, Man & Yin, Libo, 2018. "Oil prices and news-based uncertainty: Novel evidence," Energy Economics, Elsevier, vol. 72(C), pages 331-340.
    7. Jin, Xuejun & Chen, Cheng & Yang, Xiaolan, 2024. "The effect of international media news on the global stock market," International Review of Economics & Finance, Elsevier, vol. 89(PA), pages 50-69.
    8. Li Guo & Lin Peng & Yubo Tao & Jun Tu, 2017. "Joint News, Attention Spillover, and Stock Returns," Papers 1703.02715, arXiv.org, revised Jul 2025.
    9. Zheng Tracy Ke & Bryan T. Kelly & Dacheng Xiu, 2019. "Predicting Returns With Text Data," NBER Working Papers 26186, National Bureau of Economic Research, Inc.
    10. Tom Marty & Bruce Vanstone & Tobias Hahn, 2020. "News media analytics in finance: a survey," Accounting and Finance, Accounting and Finance Association of Australia and New Zealand, vol. 60(2), pages 1385-1434, June.
    11. Giuseppe Matera, 2025. "Corporate Earnings Calls and Analyst Beliefs," Papers 2511.15214, arXiv.org, revised Nov 2025.
    12. Pan, Zhiyuan & Zhong, Hao & Wang, Yudong & Huang, Juan, 2024. "Forecasting oil futures returns with news," Energy Economics, Elsevier, vol. 134(C).
    13. Andres Algaba & David Ardia & Keven Bluteau & Samuel Borms & Kris Boudt, 2020. "Econometrics Meets Sentiment: An Overview Of Methodology And Applications," Journal of Economic Surveys, Wiley Blackwell, vol. 34(3), pages 512-547, July.
    14. Pereira, Camila C. & Bastos, Saulo B. & Cajueiro, Daniel O., 2025. "The words that lead to uncertainty: A measure based on word embeddings," Economic Systems, Elsevier, vol. 49(3).
    15. Qiwen Sheng & Tomislav Vukina, 2024. "Public Communication as a Mechanism for Collusion in the Broiler Industry," Review of Industrial Organization, Springer;The Industrial Organization Society, vol. 64(1), pages 57-91, February.
    16. Liu, Sha & Han, Jingguang, 2020. "Media tone and expected stock returns," International Review of Financial Analysis, Elsevier, vol. 70(C).
    17. Yuan, Kaibin & Liang, Yuheng & Zhu, Mengnan, 2024. "Social forecasting: Online social opinion and the cross-section of stock returns," Pacific-Basin Finance Journal, Elsevier, vol. 86(C).
    18. Gupta, Rangan & Kollias, Christos & Papadamou, Stephanos & Wohar, Mark E., 2018. "News implied volatility and the stock-bond nexus: Evidence from historical data for the USA and the UK markets," Journal of Multinational Financial Management, Elsevier, vol. 47, pages 76-90.
    19. Kemal Kirtac & Guido Germano, 2025. "Large language models in finance: what is financial sentiment?," Papers 2503.03612, arXiv.org, revised Mar 2025.
    20. Anna Scherbina & Bernd Schlusche, 2016. "Economic linkages inferred from news stories and the predictability of stock returns," AEI Economics Working Papers 873600, American Enterprise Institute.
