
Comprehensive review of text-mining applications in finance

Author

Listed:
  • Aaryan Gupta

    (Nirma University)

  • Vinya Dengre

    (Nirma University)

  • Hamza Abubakar Kheruwala

    (Nirma University)

  • Manan Shah

    (Pandit Deendayal Petroleum University)

Abstract

Text-mining technologies have substantially affected financial industries. As the data in every sector of finance have grown immensely, text mining has emerged as an important field of research in the domain of finance. Therefore, reviewing the recent literature on text-mining applications in finance can be useful for identifying areas for further research. This paper focuses on the text-mining literature related to financial forecasting, banking, and corporate finance. It also analyses the existing literature on text mining in financial applications and provides a summary of some recent studies. Finally, the paper briefly discusses various text-mining methods being applied in the financial domain, the challenges faced in these applications, and the future scope of text mining in finance.

Suggested Citation

  • Aaryan Gupta & Vinya Dengre & Hamza Abubakar Kheruwala & Manan Shah, 2020. "Comprehensive review of text-mining applications in finance," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 6(1), pages 1-25, December.
  • Handle: RePEc:spr:fininn:v:6:y:2020:i:1:d:10.1186_s40854-020-00205-1
    DOI: 10.1186/s40854-020-00205-1

    Download full text from publisher

    File URL: http://link.springer.com/10.1186/s40854-020-00205-1
    File Function: Abstract
    Download Restriction: no


    References listed on IDEAS

    1. Liudmila Zavolokina & Mateusz Dolata & Gerhard Schwabe, 2016. "The FinTech phenomenon: antecedents of financial innovation perceived by the popular press," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 2(1), pages 1-16, December.
    2. Schneider, Matthew J. & Gupta, Sachin, 2016. "Forecasting sales of new and existing products using consumer reviews: A random projections approach," International Journal of Forecasting, Elsevier, vol. 32(2), pages 243-256.
    3. Yuan Song & Hongwei Wang & Maoran Zhu, 2018. "Sustainable strategy for corporate governance based on the sentiment analysis of financial reports with CSR," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 4(1), pages 1-14, December.
    4. Wataru Souma & Irena Vodenska & Hideaki Aoyama, 2019. "Enhanced news sentiment analysis using deep learning methods," Journal of Computational Social Science, Springer, vol. 2(1), pages 33-46, January.
    5. David Bholat & Stephen Hans & Pedro Santos & Cheryl Schonhardt-Bailey, 2015. "Text mining for central banks," Handbooks, Centre for Central Banking Studies, Bank of England, number 33, April.
    6. Wen, Fenghua & Xu, Longhao & Ouyang, Guangda & Kou, Gang, 2019. "Retail investor attention and stock price crash risk: Evidence from China," International Review of Financial Analysis, Elsevier, vol. 65(C).
    7. Ingrid E. Fisher & Margaret R. Garnsey & Mark E. Hughes, 2016. "Natural Language Processing in Accounting, Auditing and Finance: A Synthesis of the Literature with a Roadmap for Future Research," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 23(3), pages 157-214, July.
    8. Krstić, Živko & Seljan, Sanja & Zoroja, Jovana, 2019. "Visualization of Big Data Text Analytics in Financial Industry: A Case Study of Topic Extraction for Italian Banks," Proceedings of the ENTRENOVA - ENTerprise REsearch InNOVAtion Conference (2019), Rovinj, Croatia, in: Proceedings of the ENTRENOVA - ENTerprise REsearch InNOVAtion Conference, Rovinj, Croatia, 12-14 September 2019, pages 67-75, IRENET - Society for Advancing Innovation and Research in Economy, Zagreb.
    9. Mirjana Pejić Bach & Živko Krstić & Sanja Seljan & Lejla Turulja, 2019. "Text Mining for Big Data Analysis in Financial Sector: A Literature Review," Sustainability, MDPI, vol. 11(5), pages 1-27, February.
    10. Tim Loughran & Bill McDonald, 2011. "When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10‐Ks," Journal of Finance, American Finance Association, vol. 66(1), pages 35-65, February.
    11. Craig Lewis & Steven Young, 2019. "Fad or future? Automated analysis of financial text and its implications for corporate reporting," Accounting and Business Research, Taylor & Francis Journals, vol. 49(5), pages 587-615, July.
    12. Chakraborty, Vasundhara & Chiu, Victoria & Vasarhelyi, Miklos, 2014. "Automatic classification of accounting literature," International Journal of Accounting Information Systems, Elsevier, vol. 15(2), pages 122-148.
    13. Fama, Eugene F, 1991. "Efficient Capital Markets: II," Journal of Finance, American Finance Association, vol. 46(5), pages 1575-1617, December.

    Citations

    Cited by:

    1. Sanaz Ghorbanloo & Sajjad Shokouhyar, 2023. "Consumers' attitude footprint on sustainable development in developed and developing countries: a case study in the electronic industry," Operations Management Research, Springer, vol. 16(3), pages 1444-1475, September.
    2. Dongyoung Kim & Sungwon Jung & Yongwook Jeong, 2021. "Theft Prediction Model Based on Spatial Clustering to Reflect Spatial Characteristics of Adjacent Lands," Sustainability, MDPI, vol. 13(14), pages 1-14, July.
    3. Ahmed, Shamima & Alshater, Muneer M. & Ammari, Anis El & Hammami, Helmi, 2022. "Artificial intelligence and machine learning in finance: A bibliometric review," Research in International Business and Finance, Elsevier, vol. 61(C).
    4. Wenlu Zhao & Guanghu Jin & Chenyue Huang & Jinji Zhang, 2023. "Attention and Sentiment of the Chinese Public toward a 3D Greening System Based on Sina Weibo," IJERPH, MDPI, vol. 20(5), pages 1-20, February.
    5. Christopher Gerling & Stefan Lessmann, 2023. "Multimodal Document Analytics for Banking Process Automation," Papers 2307.11845, arXiv.org, revised Nov 2023.
    6. Yinheng Li & Shaofei Wang & Han Ding & Hang Chen, 2023. "Large Language Models in Finance: A Survey," Papers 2311.10723, arXiv.org.
    7. Kentaka Aruga & Md. Monirul Islam & Yoshihiro Zenno & Arifa Jannat, 2022. "Developing Novel Technique for Investigating Guidelines and Frameworks: A Text Mining Comparison between International and Japanese Green Bonds," JRFM, MDPI, vol. 15(9), pages 1-17, August.
    8. Mazzotta, Stefano, 2022. "Immigration narrative sentiment from TV news and the stock market," Journal of Behavioral and Experimental Finance, Elsevier, vol. 34(C).
    9. Poojan Thakkar & Manan Shah, 2021. "An Assessment of Football Through the Lens of Data Science," Annals of Data Science, Springer, vol. 8(4), pages 823-836, December.
    10. Yuanying Chi & Mingjian Yan & Yuexia Pang & Hongbo Lei, 2022. "Financial Risk Assessment of Photovoltaic Industry Listed Companies Based on Text Mining," Sustainability, MDPI, vol. 14(19), pages 1-17, September.
    11. Li-Chen Cheng & Wei-Ting Lu & Benjamin Yeo, 2023. "Predicting abnormal trading behavior from internet rumor propagation: a machine learning approach," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 9(1), pages 1-23, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Berkin, Anil & Aerts, Walter & Van Caneghem, Tom, 2023. "Feasibility analysis of machine learning for performance-related attributional statements," International Journal of Accounting Information Systems, Elsevier, vol. 48(C).
    2. Andres Algaba & David Ardia & Keven Bluteau & Samuel Borms & Kris Boudt, 2020. "Econometrics Meets Sentiment: An Overview Of Methodology And Applications," Journal of Economic Surveys, Wiley Blackwell, vol. 34(3), pages 512-547, July.
    3. Senave, Elseline & Jans, Mieke J. & Srivastava, Rajendra P., 2023. "The application of text mining in accounting," International Journal of Accounting Information Systems, Elsevier, vol. 50(C).
    4. Rybinski, Krzysztof, 2020. "The forecasting power of the multi-language narrative of sell-side research: A machine learning evaluation," Finance Research Letters, Elsevier, vol. 34(C).
    5. Ahmed, Yousry & Elshandidy, Tamer, 2016. "The effect of bidder conservatism on M&A decisions: Text-based evidence from US 10-K filings," International Review of Financial Analysis, Elsevier, vol. 46(C), pages 176-190.
    6. Leif Anders Thorsrud, 2016. "Nowcasting using news topics. Big Data versus big bank," Working Papers No 6/2016, Centre for Applied Macro- and Petroleum economics (CAMP), BI Norwegian Business School.
    7. Pastwa, Anna M. & Shrestha, Prabal & Thewissen, James & Torsin, Wouter, 2021. "Unpacking the black box of ICO white papers: a topic modeling approach," LIDAM Discussion Papers LFIN 2021018, Université catholique de Louvain, Louvain Finance (LFIN).
    8. Gianluca Anese & Marco Corazza & Michele Costola & Loriana Pelizzon, 2023. "Impact of public news sentiment on stock market index return and volatility," Computational Management Science, Springer, vol. 20(1), pages 1-36, December.
    9. Martin Baumgaertner & Johannes Zahner, 2021. "Whatever it takes to understand a central banker - Embedding their words using neural networks," MAGKS Papers on Economics 202130, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).
    10. Oehler, Andreas & Schmitz, Jonas Tobias, 2021. "Does intensified communication of hedge funds with letters affect abnormal returns?," International Review of Economics & Finance, Elsevier, vol. 76(C), pages 127-142.
    11. Youngjoon Lee & Soohyon Kim & Ki Young Park, 2018. "Deciphering Monetary Policy Committee Minutes with Text Mining Approach: A Case of South Korea," Working papers 2018rwp-132, Yonsei University, Yonsei Economics Research Institute.
    12. Johannes Zahner, 2020. "Above, but close to two percent. Evidence on the ECB’s inflation target using text mining," MAGKS Papers on Economics 202046, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).
    13. Kumar, Rahul & Deb, Soumya Guha & Mukherjee, Shubhadeep, 2020. "Do words reveal the latent truth? Identifying communication patterns of corporate losers," Journal of Behavioral and Experimental Finance, Elsevier, vol. 26(C).
    14. Stefan Angrick & Naoyuki Yoshino, 2020. "From Window Guidance to Interbank Rates: Tracing the Transition of Monetary Policy in Japan and China," International Journal of Central Banking, International Journal of Central Banking, vol. 16(3), pages 279-316, June.
    15. Rybinski, Krzysztof, 2021. "Ranking professional forecasters by the predictive power of their narratives," International Journal of Forecasting, Elsevier, vol. 37(1), pages 186-204.
    16. Shuangyan Li & Guangrui Wang & Yongli Luo, 2022. "Tone of language, financial disclosure, and earnings management: a textual analysis of form 20-F," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 8(1), pages 1-24, December.
    17. R. Erasmus & H. Hollander, 2020. "A Forward Guidance Indicator For The South African Reserve Bank: Implementing A Text Analysis Algorithm," Studies in Economics and Econometrics, Taylor & Francis Journals, vol. 44(3), pages 41-72, December.
    18. Goodell, John W. & Kumar, Satish & Lim, Weng Marc & Pattnaik, Debidutta, 2021. "Artificial intelligence and machine learning in finance: Identifying foundations, themes, and research clusters from bibliometric analysis," Journal of Behavioral and Experimental Finance, Elsevier, vol. 32(C).
    19. Ingrid E. Fisher & Margaret R. Garnsey & Mark E. Hughes, 2016. "Natural Language Processing in Accounting, Auditing and Finance: A Synthesis of the Literature with a Roadmap for Future Research," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 23(3), pages 157-214, July.
    20. Vegard Høghaug Larsen & Leif Anders Thorsrud, 2018. "Business cycle narratives," Working Papers No 6/2018, Centre for Applied Macro- and Petroleum economics (CAMP), BI Norwegian Business School.
