Printed from https://ideas.repec.org/p/arx/papers/2512.06620.html

Unveiling Hedge Funds: Topic Modeling and Sentiment Correlation with Fund Performance

Author

Listed:
  • Chang Liu

Abstract

The hedge fund industry presents significant challenges for investors due to its opacity and limited disclosure requirements. This pioneering study introduces two major innovations in financial text analysis. First, we apply topic modeling to hedge fund documents, a domain not previously explored with automated text analysis, using a unique dataset of over 35,000 documents from 1,125 hedge fund managers. We compare three state-of-the-art methods: Latent Dirichlet Allocation (LDA), Top2Vec, and BERTopic. Our findings reveal that LDA with 20 topics produces the most interpretable results for human users and gives more robust topic assignments when the number of topics varies, while Top2Vec shows superior classification performance. Second, we establish a novel quantitative framework linking document sentiment to fund performance, transforming qualitative information that traditionally requires expert interpretation into systematic investment signals. In sentiment analysis, contrary to expectations, the general-purpose DistilBERT outperforms the finance-specific FinBERT in generating sentiment scores, adapting better to the diverse linguistic patterns of hedge fund documents, which extend beyond specialized financial news text. Furthermore, sentiment scores derived using DistilBERT in combination with Top2Vec correlate more strongly with subsequent fund performance than those from other model combinations. These results demonstrate that automated topic modeling and sentiment analysis can effectively process hedge fund documents, providing investors with new data-driven decision support tools.
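
As a rough illustration of the second contribution, the sketch below correlates per-document sentiment scores with the return of the same fund in the following period. It is not the paper's implementation: the abstract does not state which correlation measure is used, so the Spearman rank correlation is an assumption, and the variables `sentiment` and `next_return` hold made-up numbers chosen only to make the example run. In practice the sentiment scores would come from a model such as DistilBERT or FinBERT.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical inputs (not from the paper): one sentiment score per document
# and the fund's return in the period after the document was issued.
sentiment = [0.62, -0.10, 0.35, 0.80, -0.45, 0.05]
next_return = [0.012, 0.003, -0.004, 0.015, -0.020, 0.001]

rho = spearman(sentiment, next_return)
print(f"Spearman rank correlation: {rho:.3f}")
```

A rank-based measure is used here only because it is insensitive to the scale of the sentiment scores, which differs between models; any comparison across model combinations would also need a significance test and a far larger sample than this toy input.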

Suggested Citation

  • Chang Liu, 2025. "Unveiling Hedge Funds: Topic Modeling and Sentiment Correlation with Fund Performance," Papers 2512.06620, arXiv.org.
  • Handle: RePEc:arx:papers:2512.06620

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2512.06620
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yuqian Xu & Lingjiong Zhu & Michael Pinedo, 2020. "Operational Risk Management: A Stochastic Control Framework with Preventive and Corrective Controls," Operations Research, INFORMS, vol. 68(6), pages 1804-1825, November.
    2. Chang, Xiaochen & Guo, Songlin & Huang, Junkai, 2022. "Kidnapped mutual funds: Irrational preference of naive investors and fund incentive distortion," International Review of Financial Analysis, Elsevier, vol. 83(C).
    3. Jędrzej Białkowski & Huong Dieu Dang & Xiaopeng Wei, 2017. "Does the Tail Wag the Dog? Evidence from Fund Flow to VIX ETFs and ETNs," Working Papers in Economics 17/17, University of Canterbury, Department of Economics and Finance.
    4. Evangeline O. Elijido-Ten & Peter Clarkson, 2019. "Going Beyond Climate Change Risk Management: Insights from the World’s Largest Most Sustainable Corporations," Journal of Business Ethics, Springer, vol. 157(4), pages 1067-1089, July.
    5. Bali, Turan G. & Brown, Stephen J. & Caglayan, Mustafa O., 2019. "Upside potential of hedge funds as a predictor of future performance," Journal of Banking & Finance, Elsevier, vol. 98(C), pages 212-229.
    6. Zvi Singer & Jing Zhang, 2022. "Do companies try to conceal financial misstatements through auditor shopping?," Journal of Business Finance & Accounting, Wiley Blackwell, vol. 49(1-2), pages 140-180, January.
    7. Aiken, Adam L. & Clifford, Christopher P. & Ellis, Jesse A., 2015. "Hedge funds and discretionary liquidity restrictions," Journal of Financial Economics, Elsevier, vol. 116(1), pages 197-218.
    8. Li, Lu & Li, Yihang & Wang, Xueding & Xiao, Tusheng & Zhu, Hongjun, 2022. "Hedge fund networks, information dissemination, and stock price comovement: Evidence from China," International Review of Financial Analysis, Elsevier, vol. 83(C).
    9. Xue, Yi & Gençay, Ramazan, 2012. "Hierarchical information and the rate of information diffusion," Journal of Economic Dynamics and Control, Elsevier, vol. 36(9), pages 1372-1401.
    10. Daniel Barth & Juha Joenvaara & Mikko Kauppila & Russ Wermers, 2020. "The Hedge Fund Industry is Bigger (and has Performed Better) Than You Think," Working Papers 20-01, Office of Financial Research, US Department of the Treasury, revised 08 Mar 2021.
    11. Liebmann, Michael & Orlov, Alexei G. & Neumann, Dirk, 2016. "The tone of financial news and the perceptions of stock and CDS traders," International Review of Financial Analysis, Elsevier, vol. 46(C), pages 159-175.
    12. Jorion, Philippe & Schwarz, Christopher, 2014. "Are hedge fund managers systematically misreporting? Or not?," Journal of Financial Economics, Elsevier, vol. 111(2), pages 311-327.
    13. Lo, Kin & Ramos, Felipe & Rogo, Rafael, 2017. "Earnings management and annual report readability," Journal of Accounting and Economics, Elsevier, vol. 63(1), pages 1-25.
    14. Ionel Bostan & Ionela-Corina Chersan & Magdalena Danileț & Mihaela Ifrim & Viorica Chirilă, 2020. "Investigations Regarding the Linguistic Register Used by Managers to Convey to Stakeholders a Positive View of Their Company, in the Context of the Business Sustainability Desideratum," Sustainability, MDPI, vol. 12(17), pages 1-19, August.
    15. Boone, Jeff & Hao, Jie & Linthicum, Cheryl & Pham, Viet, 2024. "Impression management strategy — The relationship between accounting narrative thematic bias and financial graph distortion," The British Accounting Review, Elsevier, vol. 56(4).
    16. Todd Pezzuti & James M. Leonhardt, 2023. "What’s not to like? Negations in brand messages increase consumer engagement," Journal of the Academy of Marketing Science, Springer, vol. 51(3), pages 675-694, May.
    17. Xi Fu & Xiaoxi Wu & Zhifang Zhang, 2021. "The Information Role of Earnings Conference Call Tone: Evidence from Stock Price Crash Risk," Journal of Business Ethics, Springer, vol. 173(3), pages 643-660, October.
    18. Bannier, Christina E. & Pauls, Thomas & Walter, Andreas, 2017. "CEO-speeches and stock returns," VfS Annual Conference 2017 (Vienna): Alternative Structures for Money and Banking 168192, Verein für Socialpolitik / German Economic Association.
    19. Brown, Stephen & Goetzmann, William & Liang, Bing & Schwarz, Christopher, 2012. "Trust and delegation," Journal of Financial Economics, Elsevier, vol. 103(2), pages 221-234.
    20. Lingling Zheng & Xuemin (Sterling) Yan, 2021. "Financial Industry Affiliation and Hedge Fund Performance," Management Science, INFORMS, vol. 67(12), pages 7844-7865, December.
