
Multi-Label Topic Model for Financial Textual Data

Author

Listed:
  • Moritz Scherrmann

Abstract

This paper presents a multi-label topic model for financial texts such as ad-hoc announcements, 8-K filings, finance-related news, and annual reports. I train the model on a new financial multi-label database consisting of 3,044 German ad-hoc announcements that are manually labeled with 20 predefined, economically motivated topics. The best model achieves a macro F1 score of more than 85%. Translating the data yields an English version of the model with similar performance. As an application of the model, I investigate differences in stock market reactions across topics. I find evidence of strong positive or negative market reactions for some topics, such as announcements of new Large Scale Projects or Bankruptcy Filings, while I do not observe significant price effects for some other topics. Furthermore, in contrast to previous studies, the multi-label structure of the model makes it possible to analyze the effects of co-occurring topics on stock market reactions. In many cases, the reaction to a specific topic depends heavily on its co-occurrence with other topics. For example, if capital raised in a Seasoned Equity Offering (SEO) is used to restructure a company in the course of a Bankruptcy Proceeding, the market reacts positively on average. However, if that capital is used to cover unexpected additional costs from the development of new drugs, the SEO is associated with negative reactions on average.
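
The macro F1 score mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's evaluation code. The function name `macro_f1`, the four toy announcements, and their predicted labels are assumptions made up for illustration; only the three topic names come from the abstract. Macro F1 computes one F1 score per topic and averages them with equal weight, so rare topics count as much as frequent ones.

```python
from typing import List, Set


def macro_f1(y_true: List[Set[str]], y_pred: List[Set[str]], labels: List[str]) -> float:
    """Unweighted mean of per-label F1 scores for multi-label predictions."""
    per_label = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if label in t and label in p)
        fp = sum(1 for t, p in pairs if label not in t and label in p)
        fn = sum(1 for t, p in pairs if label in t and label not in p)
        denom = 2 * tp + fp + fn
        # Convention assumed here: a label with no true or predicted
        # occurrences contributes an F1 of 0.
        per_label.append(2 * tp / denom if denom else 0.0)
    return sum(per_label) / len(per_label)


# Hypothetical toy data: each announcement may carry several topics.
labels = ["SEO", "Bankruptcy", "Large Scale Project"]
y_true = [{"SEO", "Bankruptcy"}, {"Large Scale Project"}, {"SEO"}, {"Bankruptcy"}]
y_pred = [{"SEO", "Bankruptcy"}, {"Large Scale Project"}, {"SEO", "Bankruptcy"}, set()]

score = macro_f1(y_true, y_pred, labels)
print(f"macro F1 = {score:.4f}")
```

In this toy case the per-topic F1 scores are 1.0 for SEO, 0.5 for Bankruptcy, and 1.0 for Large Scale Project, so the macro average is 2.5/3.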

Suggested Citation

  • Moritz Scherrmann, 2023. "Multi-Label Topic Model for Financial Textual Data," Papers 2311.07598, arXiv.org.
  • Handle: RePEc:arx:papers:2311.07598

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2311.07598
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Prajwal Eachempati & Praveen Ranjan Srivastava, 2021. "Accounting for unadjusted news sentiment for asset pricing," Qualitative Research in Financial Markets, Emerald Group Publishing Limited, vol. 13(3), pages 383-422, May.
    2. Bommes, Elisabeth & Chen, Cathy Yi-Hsuan & Härdle, Wolfgang Karl, 2018. "Textual Sentiment and Sector specific reaction," IRTG 1792 Discussion Papers 2018-043, Humboldt University of Berlin, International Research Training Group 1792 "High Dimensional Nonstationary Time Series".
    3. Stefan Feuerriegel & Nicolas Prollochs, 2018. "Investor Reaction to Financial Disclosures Across Topics: An Application of Latent Dirichlet Allocation," Papers 1805.03308, arXiv.org.
    4. Kirtac, Kemal & Germano, Guido, 2024. "Sentiment trading with large language models," Finance Research Letters, Elsevier, vol. 62(PB).
    5. Chen, Cathy Yi-Hsuan & Fengler, Matthias R. & Härdle, Wolfgang Karl & Liu, Yanchu, 2022. "Media-expressed tone, option characteristics, and stock return predictability," Journal of Economic Dynamics and Control, Elsevier, vol. 134(C).
    6. Steven Heston & Nitish R. Sinha, 2016. "News versus Sentiment : Predicting Stock Returns from News Stories," Finance and Economics Discussion Series 2016-048, Board of Governors of the Federal Reserve System (U.S.).
    7. Sun, Andrew & Lachanski, Michael & Fabozzi, Frank J., 2016. "Trade the tweet: Social media text mining and sparse matrix factorization for stock market prediction," International Review of Financial Analysis, Elsevier, vol. 48(C), pages 272-281.
    8. Gustaf Bellstam & Sanjai Bhagat & J. Anthony Cookson, 2021. "A Text-Based Analysis of Corporate Innovation," Management Science, INFORMS, vol. 67(7), pages 4004-4031, July.
    9. An, Suwei, 2023. "Essays on incentive contracts, M&As, and firm risk," Other publications TiSEM dd97d2f5-1c9d-47c5-ba62-f, Tilburg University, School of Economics and Management.
    10. Duygu Ider & Stefan Lessmann, 2022. "Forecasting Cryptocurrency Returns from Sentiment Signals: An Analysis of BERT Classifiers and Weak Supervision," Papers 2204.05781, arXiv.org, revised Mar 2023.
    11. Darko B. Vuković & Senanu Dekpo-Adza & Stefana Matović, 2025. "AI integration in financial services: a systematic review of trends and regulatory challenges," Palgrave Communications, Palgrave Macmillan, vol. 12(1), pages 1-29, December.
    12. Gabriele Ranco & Ilaria Bordino & Giacomo Bormetti & Guido Caldarelli & Fabrizio Lillo & Michele Treccani, 2014. "Coupling news sentiment with web browsing data improves prediction of intra-day price dynamics," Papers 1412.3948, arXiv.org, revised Dec 2015.
    13. Vegard Høghaug Larsen & Leif Anders Thorsrud, 2022. "Asset returns, news topics, and media effects," Scandinavian Journal of Economics, Wiley Blackwell, vol. 124(3), pages 838-868, July.
    14. Barth, Andreas & Mansouri, Sasan & Wöbbeking, Fabian, 2024. "Information flow and market efficiency -- unintended side effects of the Plain Writing Act," VfS Annual Conference 2024 (Berlin): Upcoming Labor Market Challenges 302384, Verein für Socialpolitik / German Economic Association.
    15. Julian Junyan Wang & Victor Xiaoqi Wang, 2025. "Assessing Consistency and Reproducibility in the Outputs of Large Language Models: Evidence Across Diverse Finance and Accounting Tasks," Papers 2503.16974, arXiv.org, revised Mar 2025.
    16. Fang, Yiwei & Fiordelisi, Franco & Hasan, Iftekhar & Leung, Woon Sau & Wong, Gabriel, 2023. "Corporate culture and firm value: Evidence from crisis," Journal of Banking & Finance, Elsevier, vol. 146(C).
    17. Chouliaras, Andreas, 2015. "High Frequency Newswire Textual Sentiment: Evidence from international stock markets during the European Financial Crisis," MPRA Paper 62524, University Library of Munich, Germany.
    18. Alasdair Brown & Dooruj Rambaccussing & James Reade & Giambattista Rossi, 2016. "Using Social Media to Identify Market Inefficiencies: Evidence from Twitter and Betfair," Economics Discussion Papers em-dp2016-01, Department of Economics, University of Reading.
    19. Stefan Claus & Massimo Stella, 2022. "Natural Language Processing and Cognitive Networks Identify UK Insurers’ Trends in Investor Day Transcripts," Future Internet, MDPI, vol. 14(10), pages 1-18, October.
    20. Alberto Barroso Del Toro & Laura Vivas Crisol & Xavier Tort-Martorell, 2022. "The Sustainability Narrative: A Multi Study Using Event Studies to Analyse the American Energy Companies Shareholder’s Reaction to Sustainability News," IJERPH, MDPI, vol. 19(23), pages 1-17, November.
