
SEntFiN 1.0: Entity‐aware sentiment analysis for financial news

Authors

  • Ankur Sinha
  • Satishwar Kedas
  • Rishu Kumar
  • Pekka Malo

Abstract

Fine‐grained financial sentiment analysis on news headlines is a challenging task that requires human‐annotated datasets to achieve high performance. Few studies have addressed sentiment extraction in a setting where multiple entities are present in a news headline. To further research in this area, we make publicly available SEntFiN 1.0, a human‐annotated dataset of 10,753 news headlines with entity‐sentiment annotations, of which 2,847 headlines contain multiple entities, often with conflicting sentiments. We augment the dataset with a database of over 1,000 financial entities and their various representations in news media, amounting to over 5,000 phrases. We propose a framework that enables the extraction of entity‐relevant sentiments using a feature‐based approach rather than an expression‐based approach. For sentiment extraction, we use 12 different learning schemes built on lexicon‐based and pretrained sentence representations and five classification approaches. Our experiments indicate that lexicon‐based N‐gram ensembles perform above par compared with pretrained word embedding schemes such as GloVe. Overall, RoBERTa and finBERT (domain‐specific BERT) achieve the highest average accuracy of 94.29% and F1‐score of 93.27%. Further, using over 210,000 entity‐sentiment predictions, we validate the economic effect of sentiments on aggregate market movements over a long duration.
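
To make the idea of entity‐sentiment annotation concrete, the following is a minimal sketch of how a headline with several entities might be represented and scored. It is not the authors' code or the SEntFiN file format: the headlines, entity names, field names (`title`, `gold`, `pred`), the three‐label scheme, and the choice of macro‐averaged F1 are all assumptions made for illustration, since the abstract does not state the label set or how the F1‐score is averaged.

```python
# Illustrative only: invented headlines and a hypothetical record layout.
LABELS = ("positive", "negative", "neutral")  # assumed label set

headlines = [
    {"title": "Alpha Bank gains while Beta Motors slips on weak sales",
     "gold": {"Alpha Bank": "positive", "Beta Motors": "negative"},
     "pred": {"Alpha Bank": "positive", "Beta Motors": "neutral"}},
    {"title": "Gamma Steel falls after profit warning",
     "gold": {"Gamma Steel": "negative"},
     "pred": {"Gamma Steel": "negative"}},
    {"title": "Delta Power flat, Epsilon Tech rallies on order win",
     "gold": {"Delta Power": "neutral", "Epsilon Tech": "positive"},
     "pred": {"Delta Power": "neutral", "Epsilon Tech": "positive"}},
]


def flatten(records):
    """Turn per-headline annotations into (gold, predicted) pairs, one per entity."""
    return [(gold, rec["pred"][entity])
            for rec in records
            for entity, gold in rec["gold"].items()]


def has_conflict(record):
    """True when entities in one headline carry different gold sentiments."""
    return len(set(record["gold"].values())) > 1


def accuracy(pairs):
    """Share of entity-level predictions that match the gold label."""
    return sum(g == p for g, p in pairs) / len(pairs)


def macro_f1(pairs, labels=LABELS):
    """Unweighted mean of per-label F1 (the averaging choice is an assumption)."""
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


pairs = flatten(headlines)
n_conflicting = sum(has_conflict(rec) for rec in headlines)
print(f"entity-level pairs: {len(pairs)}, conflicting headlines: {n_conflicting}")
print(f"accuracy: {accuracy(pairs):.4f}, macro F1: {macro_f1(pairs):.4f}")
```

Scoring per entity rather than per headline is what lets a single headline contribute both a positive and a negative label, which is the multi‐entity case the abstract highlights.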

Suggested Citation

  • Ankur Sinha & Satishwar Kedas & Rishu Kumar & Pekka Malo, 2022. "SEntFiN 1.0: Entity‐aware sentiment analysis for financial news," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 73(9), pages 1314-1335, September.
  • Handle: RePEc:bla:jinfst:v:73:y:2022:i:9:p:1314-1335
    DOI: 10.1002/asi.24634

    Download full text from publisher

    File URL: https://doi.org/10.1002/asi.24634
    Download Restriction: no



    Citations


    Cited by:

    1. Costola, Michele & Hinz, Oliver & Nofer, Michael & Pelizzon, Loriana, 2023. "Machine learning sentiment analysis, COVID-19 news and stock market reactions," Research in International Business and Finance, Elsevier, vol. 64(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sinha, Ankur & Kedas, Satishwar & Kumar, Rishu & Malo, Pekka, 2019. "Buy, Sell or Hold: Entity-Aware Classification of Business News," IIMA Working Papers WP 2019-04-02, Indian Institute of Management Ahmedabad, Research and Publication Department.
    2. Ingrid E. Fisher & Margaret R. Garnsey & Mark E. Hughes, 2016. "Natural Language Processing in Accounting, Auditing and Finance: A Synthesis of the Literature with a Roadmap for Future Research," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 23(3), pages 157-214, July.
    3. Renault, Thomas, 2017. "Intraday online investor sentiment and return patterns in the U.S. stock market," Journal of Banking & Finance, Elsevier, vol. 84(C), pages 25-40.
    4. Andres Algaba & David Ardia & Keven Bluteau & Samuel Borms & Kris Boudt, 2020. "Econometrics Meets Sentiment: An Overview Of Methodology And Applications," Journal of Economic Surveys, Wiley Blackwell, vol. 34(3), pages 512-547, July.
    5. Chen, Cathy Yi-Hsuan & Fengler, Matthias R. & Härdle, Wolfgang Karl & Liu, Yanchu, 2022. "Media-expressed tone, option characteristics, and stock return predictability," Journal of Economic Dynamics and Control, Elsevier, vol. 134(C).
    6. Ahmed, Yousry & Elshandidy, Tamer, 2016. "The effect of bidder conservatism on M&A decisions: Text-based evidence from US 10-K filings," International Review of Financial Analysis, Elsevier, vol. 46(C), pages 176-190.
    7. Thomas Renault, 2020. "Sentiment analysis and machine learning in finance: a comparison of methods and models on one million messages," Digital Finance, Springer, vol. 2(1), pages 1-13, September.
    8. Ahmad, Khurshid & Han, JingGuang & Hutson, Elaine & Kearney, Colm & Liu, Sha, 2016. "Media-expressed negative tone and firm-level stock returns," Journal of Corporate Finance, Elsevier, vol. 37(C), pages 152-172.
    9. Vegard Høghaug Larsen & Leif Anders Thorsrud, 2022. "Asset returns, news topics, and media effects," Scandinavian Journal of Economics, Wiley Blackwell, vol. 124(3), pages 838-868, July.
    10. Maciej Wujec, 2021. "Analysis of the Financial Information Contained in the Texts of Current Reports: A Deep Learning Approach," JRFM, MDPI, vol. 14(12), pages 1-17, December.
    11. Loughran, Tim & McDonald, Bill & Pragidis, Ioannis, 2019. "Assimilation of oil news into prices," International Review of Financial Analysis, Elsevier, vol. 63(C), pages 105-118.
    12. Anand, Abhinav & Basu, Sankarshan & Pathak, Jalaj & Thampy, Ashok, 2021. "The impact of sentiment on emerging stock markets," International Review of Economics & Finance, Elsevier, vol. 75(C), pages 161-177.
    13. Fiordelisi, Franco & Ricci, Ornella, 2014. "Corporate culture and CEO turnover," Journal of Corporate Finance, Elsevier, vol. 28(C), pages 66-82.
    14. Ioanna Kountouri & Eleftherios Manousakis & Andrianos E. Tsekrekos, 2019. "Latent semantic analysis of corporate social responsibility reports (with an application to Hellenic firms)," International Journal of Disclosure and Governance, Palgrave Macmillan, vol. 16(1), pages 1-19, March.
    15. Christina Bannier & Thomas Pauls & Andreas Walter, 2019. "Content analysis of business communication: introducing a German dictionary," Journal of Business Economics, Springer, vol. 89(1), pages 79-123, February.
    16. Yi-Hsuan Chen, Cathy & Fengler, Matthias & Härdle, Wolfgang Karl & Liu, Yanchu, 2018. "Textual Sentiment, Option Characteristics, and Stock Return Predictability," Economics Working Paper Series 1808, University of St. Gallen, School of Economics and Political Science.
    17. Jozef Barunik & Cathy Yi-Hsuan Chen & Jan Vecer, 2019. "Sentiment-Driven Stochastic Volatility Model: A High-Frequency Textual Tool for Economists," Papers 1906.00059, arXiv.org.
    18. Soumya Mukhopadhyay, 2018. "Opinion mining in management research: the state of the art and the way forward," OPSEARCH, Springer;Operational Research Society of India, vol. 55(2), pages 221-250, June.
    19. Tom Marty & Bruce Vanstone & Tobias Hahn, 2020. "News media analytics in finance: a survey," Accounting and Finance, Accounting and Finance Association of Australia and New Zealand, vol. 60(2), pages 1385-1434, June.
    20. Enwei Zhu & Jing Wu & Hongyu Liu & Keyang Li, 2023. "A Sentiment Index of the Housing Market in China: Text Mining of Narratives on Social Media," The Journal of Real Estate Finance and Economics, Springer, vol. 66(1), pages 77-118, January.
