The application of text mining in accounting

Authors

Listed:
  • Senave, Elseline
  • Jans, Mieke J.
  • Srivastava, Rajendra P.

Abstract

By facilitating the derivation of knowledge and qualitative measures from textual data, text mining techniques have come into vogue in various domains and industries. In accounting in particular, text mining outputs can elucidate, complement, and validate the customary quantitative data. This study creates an up-to-date view of text mining applications in accounting practice. Through a critical review of the text mining literature, it gives insight into the stages of a typical text mining process, the contemporary text mining techniques that have been identified as valuable in an accounting context, and the information that can be obtained by applying these techniques.

Suggested Citation

  • Senave, Elseline & Jans, Mieke J. & Srivastava, Rajendra P., 2023. "The application of text mining in accounting," International Journal of Accounting Information Systems, Elsevier, vol. 50(C).
  • Handle: RePEc:eee:ijoais:v:50:y:2023:i:c:s1467089523000167
    DOI: 10.1016/j.accinf.2023.100624

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1467089523000167
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Lo, Kin & Ramos, Felipe & Rogo, Rafael, 2017. "Earnings management and annual report readability," Journal of Accounting and Economics, Elsevier, vol. 63(1), pages 1-25.
    2. Bonsall, Samuel B. & Leone, Andrew J. & Miller, Brian P. & Rennekamp, Kristina, 2017. "A plain English measure of financial reporting readability," Journal of Accounting and Economics, Elsevier, vol. 63(2), pages 329-357.
    3. Sunita Goel & Jagdish Gangolly, 2012. "Beyond The Numbers: Mining The Annual Reports For Hidden Cues Indicative Of Financial Statement Fraud," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 19(2), pages 75-89, April.
    4. Lynnette Purda & David Skillicorn, 2015. "Accounting Variables, Deception, and a Bag of Words: Assessing the Tools of Fraud Detection," Contemporary Accounting Research, John Wiley & Sons, vol. 32(3), pages 1193-1223, September.
    5. Tim Loughran & Bill McDonald, 2014. "Measuring Readability in Financial Disclosures," Journal of Finance, American Finance Association, vol. 69(4), pages 1643-1671, August.
    6. Kathleen Weiss Hanley, 2010. "The Information Content of IPO Prospectuses," Review of Financial Studies, Society for Financial Studies, vol. 23(7), pages 2821-2864, July.
    7. Chansog (Francis) Kim & Ke Wang & Liandong Zhang, 2019. "Readability of 10‐K Reports and Stock Price Crash Risk," Contemporary Accounting Research, John Wiley & Sons, vol. 36(2), pages 1184-1216, June.
    8. repec:eme:jaar00:jaar-01-2018-0016 is not listed on IDEAS
    9. Kearney, Colm & Liu, Sha, 2014. "Textual sentiment in finance: A survey of methods and models," International Review of Financial Analysis, Elsevier, vol. 33(C), pages 171-185.
    10. Lu Wei & Guowen Li & Xiaoqian Zhu & Jianping Li, 2019. "Discovering bank risk factors from financial statements based on a new semi‐supervised text mining algorithm," Accounting and Finance, Accounting and Finance Association of Australia and New Zealand, vol. 59(3), pages 1519-1552, September.
    11. Ingrid E. Fisher & Margaret R. Garnsey & Mark E. Hughes, 2016. "Natural Language Processing in Accounting, Auditing and Finance: A Synthesis of the Literature with a Roadmap for Future Research," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 23(3), pages 157-214, July.
    12. Alzamil, Zamil & Appelbaum, Deniz & Nehmer, Robert, 2020. "An ontological artifact for classifying social media: Text mining analysis for financial data," International Journal of Accounting Information Systems, Elsevier, vol. 38(C).
    13. Angela K. Davis & Jeremy M. Piger & Lisa M. Sedor, 2012. "Beyond the Numbers: Measuring the Information Content of Earnings Press Release Language," Contemporary Accounting Research, John Wiley & Sons, vol. 29(3), pages 845-868, September.
    14. Mirjana Pejić Bach & Živko Krstić & Sanja Seljan & Lejla Turulja, 2019. "Text Mining for Big Data Analysis in Financial Sector: A Literature Review," Sustainability, MDPI, vol. 11(5), pages 1-27, February.
    15. Brian J. Bushee & Ian D. Gow & Daniel J. Taylor, 2018. "Linguistic Complexity in Firm Disclosures: Obfuscation or Information?," Journal of Accounting Research, Wiley Blackwell, vol. 56(1), pages 85-121, March.
    16. Li, Feng, 2008. "Annual report readability, current earnings, and earnings persistence," Journal of Accounting and Economics, Elsevier, vol. 45(2-3), pages 221-247, August.
    17. Liu, Baixiao & McConnell, John J., 2013. "The role of the media in corporate governance: Do the media influence managers' capital allocation decisions?," Journal of Financial Economics, Elsevier, vol. 110(1), pages 1-17.
    18. Chou, Chi-Chun & Chang, C. Janie & Peng, Jacob, 2016. "Integrating XBRL data with textual information in Chinese: A semantic web approach," International Journal of Accounting Information Systems, Elsevier, vol. 21(C), pages 32-46.
    19. Yang Bao & Anindya Datta, 2014. "Simultaneously Discovering and Quantifying Risk Types from Textual Risk Disclosures," Management Science, INFORMS, vol. 60(6), pages 1371-1391, June.
    20. James Doran & David Peterson & S. Price, 2012. "Earnings Conference Call Content and Stock Price: The Case of REITs," The Journal of Real Estate Finance and Economics, Springer, vol. 45(2), pages 402-434, August.
    21. Paul C. Tetlock & Maytal Saar‐Tsechansky & Sofus Macskassy, 2008. "More Than Words: Quantifying Language to Measure Firms' Fundamentals," Journal of Finance, American Finance Association, vol. 63(3), pages 1437-1467, June.
    22. Tim Loughran & Bill McDonald, 2016. "Textual Analysis in Accounting and Finance: A Survey," Journal of Accounting Research, Wiley Blackwell, vol. 54(4), pages 1187-1230, September.
    23. Tim Loughran & Bill McDonald, 2014. "Regulation and financial disclosure: The impact of plain English," Journal of Regulatory Economics, Springer, vol. 45(1), pages 94-113, February.
    24. Mark A. Clatworthy & Michael John Jones, 2006. "Differential patterns of textual characteristics and company performance in the chairman's statement," Accounting, Auditing & Accountability Journal, Emerald Group Publishing Limited, vol. 19(4), pages 493-511, July.
    25. Joseph F. Brazel & Keith L. Jones & Mark F. Zimbelman, 2009. "Using Nonfinancial Measures to Assess Fraud Risk," Journal of Accounting Research, Wiley Blackwell, vol. 47(5), pages 1135-1166, December.
    26. Kristian D. Allee & Matthew D. DeAngelis, 2015. "The Structure of Voluntary Disclosure Narratives: Evidence from Tone Dispersion," Journal of Accounting Research, Wiley Blackwell, vol. 53(2), pages 241-274, May.
    27. Rong Yang & Yang Yu & Manlu Liu & Kean Wu, 2018. "Corporate Risk Disclosure and Audit Fee: A Text Mining Approach," European Accounting Review, Taylor & Francis Journals, vol. 27(3), pages 583-594, May.
    28. Hoberg, Gerard & Lewis, Craig, 2017. "Do fraudulent firms produce abnormal disclosure?," Journal of Corporate Finance, Elsevier, vol. 43(C), pages 58-85.
    29. Eddy Cardinaels & Stephan Hollander & Brian J. White, 2019. "Automatic summarization of earnings releases: attributes and effects on investors’ judgments," Review of Accounting Studies, Springer, vol. 24(3), pages 860-890, September.
    30. Paul C. Tetlock, 2007. "Giving Content to Investor Sentiment: The Role of Media in the Stock Market," Journal of Finance, American Finance Association, vol. 62(3), pages 1139-1168, June.
    31. Werner Antweiler & Murray Z. Frank, 2004. "Is All That Talk Just Noise? The Information Content of Internet Stock Message Boards," Journal of Finance, American Finance Association, vol. 59(3), pages 1259-1294, June.
    32. Tim Loughran & Bill McDonald, 2020. "Textual Analysis in Finance," Annual Review of Financial Economics, Annual Reviews, vol. 12(1), pages 357-375, December.
    33. Cardinaels, Eddy & Hollander, Stephan & White, Brian, 2019. "Automatic summarization of earnings releases: Attributes and effects on investors’ judgments," Other publications TiSEM 721f64f4-033e-453b-a3e7-2, Tilburg University, School of Economics and Management.
    34. Kris Boudt & James Thewissen, 2019. "Jockeying for Position in CEO Letters: Impression Management and Sentiment Analytics," Financial Management, Financial Management Association International, vol. 48(1), pages 77-115, March.
    35. Hutchison, Paul D. & Daigle, Ronald J. & George, Benjamin, 2018. "Application of latent semantic analysis in AIS academic research," International Journal of Accounting Information Systems, Elsevier, vol. 31(C), pages 83-96.
    36. de Souza, João Antônio Salvador & Rissatti, Jean Carlo & Rover, Suliani & Borba, José Alonso, 2019. "The linguistic complexities of narrative accounting disclosure on financial statements: An analysis based on readability characteristics," Research in International Business and Finance, Elsevier, vol. 48(C), pages 59-74.
    37. Tim Loughran & Bill McDonald, 2011. "When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10‐Ks," Journal of Finance, American Finance Association, vol. 66(1), pages 35-65, February.
    38. Boritz, J. Efrim & Hayes, Louise & Lim, Jee-Hae, 2013. "A content analysis of auditors' reports on IT internal control weaknesses: The comparative advantages of an automated approach to control weakness identification," International Journal of Accounting Information Systems, Elsevier, vol. 14(2), pages 138-163.
    39. Nerissa C. Brown & Richard M. Crowley & W. Brooke Elliott, 2020. "What Are You Saying? Using Topic to Detect Financial Misreporting," Journal of Accounting Research, Wiley Blackwell, vol. 58(1), pages 237-291, March.
    40. John L. Campbell & Hye Seung “Grace” Lee & Hsin‐Min Lu & Logan B. Steele, 2020. "Express Yourself: Why Managers' Disclosure Tone Varies Across Time and What Investors Learn From It," Contemporary Accounting Research, John Wiley & Sons, vol. 37(2), pages 1140-1171, June.
    41. Alireza Rahrovi Dastjerdi & Daruosh Foroghi & Gholam Hossain Kiani, 2019. "Detecting manager’s fraud risk using text analysis: evidence from Iran," Journal of Applied Accounting Research, Emerald Group Publishing Limited, vol. 20(2), pages 154-171, June.
    42. Li, Jingyu & Li, Jianping & Zhu, Xiaoqian, 2020. "Risk dependence between energy corporations: A text-based measurement approach," International Review of Economics & Finance, Elsevier, vol. 68(C), pages 33-46.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tim Loughran & Bill McDonald, 2016. "Textual Analysis in Accounting and Finance: A Survey," Journal of Accounting Research, Wiley Blackwell, vol. 54(4), pages 1187-1230, September.
    2. Richard Frankel & Jared Jennings & Joshua Lee, 2022. "Disclosure Sentiment: Machine Learning vs. Dictionary Methods," Management Science, INFORMS, vol. 68(7), pages 5514-5532, July.
    3. Ingrid E. Fisher & Margaret R. Garnsey & Mark E. Hughes, 2016. "Natural Language Processing in Accounting, Auditing and Finance: A Synthesis of the Literature with a Roadmap for Future Research," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 23(3), pages 157-214, July.
    4. Blankespoor, Elizabeth & deHaan, Ed & Marinovic, Iván, 2020. "Disclosure processing costs, investors’ information choice, and equity market outcomes: A review," Journal of Accounting and Economics, Elsevier, vol. 70(2).
    5. Christina Bannier & Thomas Pauls & Andreas Walter, 2019. "Content analysis of business communication: introducing a German dictionary," Journal of Business Economics, Springer, vol. 89(1), pages 79-123, February.
    6. Berkin, Anil & Aerts, Walter & Van Caneghem, Tom, 2023. "Feasibility analysis of machine learning for performance-related attributional statements," International Journal of Accounting Information Systems, Elsevier, vol. 48(C).
    7. Nadine Gatzert & Dinah Heidinger, 2020. "An Empirical Analysis of Market Reactions to the First Solvency and Financial Condition Reports in the European Insurance Industry," Journal of Risk & Insurance, The American Risk and Insurance Association, vol. 87(2), pages 407-436, June.
    8. Bian, Shibo & Jia, Dekui & Li, Ruihai & Sun, Wujun & Yan, Zhipeng & Zheng, Yingfei, 2021. "Can management tone predict IPO performance? – Evidence from mandatory online roadshows in China," Pacific-Basin Finance Journal, Elsevier, vol. 68(C).
    9. Andres Algaba & David Ardia & Keven Bluteau & Samuel Borms & Kris Boudt, 2020. "Econometrics Meets Sentiment: An Overview Of Methodology And Applications," Journal of Economic Surveys, Wiley Blackwell, vol. 34(3), pages 512-547, July.
    10. Yan Luo & Linying Zhou, 2020. "Textual tone in corporate financial disclosures: a survey of the literature," International Journal of Disclosure and Governance, Palgrave Macmillan, vol. 17(2), pages 101-110, September.
    11. Simon Fritzsch & Philipp Scharner & Gregor Weiß, 2021. "Estimating the relation between digitalization and the market value of insurers," Journal of Risk & Insurance, The American Risk and Insurance Association, vol. 88(3), pages 529-567, September.
    12. Frankel, Richard & Jennings, Jared & Lee, Joshua, 2016. "Using unstructured and qualitative disclosures to explain accruals," Journal of Accounting and Economics, Elsevier, vol. 62(2), pages 209-227.
    13. Jia, Jing & Li, Zhongtian, 2022. "Risk management committees and readability of risk management disclosure," Journal of Contemporary Accounting and Economics, Elsevier, vol. 18(3).
    14. Buehlmaier, Matthias M. M. & Zechner, Josef, 2016. "Financial media, price discovery, and merger arbitrage," CFS Working Paper Series 551, Center for Financial Studies (CFS).
    15. Fengler, Matthias & Phan, Minh Tri, 2023. "A Topic Model for 10-K Management Disclosures," Economics Working Paper Series 2307, University of St. Gallen, School of Economics and Political Science.
    16. Daniele Ballinari & Simon Behrendt, 2021. "How to gauge investor behavior? A comparison of online investor sentiment measures," Digital Finance, Springer, vol. 3(2), pages 169-204, June.
    17. Smith, Kecia Williams, 2023. "Tell Me More: A content analysis of expanded auditor reporting in the United Kingdom," Accounting, Organizations and Society, Elsevier, vol. 108(C).
    18. Bassyouny, Hesham & Abdelfattah, Tarek & Tao, Lei, 2022. "Narrative disclosure tone: A review and areas for future research," Journal of International Accounting, Auditing and Taxation, Elsevier, vol. 49(C).
    19. Muhammad Farhan Malik & Yuan George Shan & Jamie Yixing Tong, 2022. "Do auditors price litigious tone?," Accounting and Finance, Accounting and Finance Association of Australia and New Zealand, vol. 62(S1), pages 1715-1760, April.
    20. James P. Ryans, 2021. "Textual classification of SEC comment letters," Review of Accounting Studies, Springer, vol. 26(1), pages 37-80, March.
