Printed from https://ideas.repec.org/p/zbw/irtgdp/2020007.html

Deep Learning application for fraud detection in financial statements

Authors

  • Craja, Patricia
  • Kim, Alisa
  • Lessmann, Stefan

Abstract

Financial statement fraud is an area of significant consternation for potential investors, auditing companies, and state regulators. Intelligent systems facilitate detecting financial statement fraud and assist the decision-making of relevant stakeholders. Previous research detected instances in which financial statements have been fraudulently misrepresented in managerial comments. The paper aims to investigate whether it is possible to develop an enhanced system for detecting financial fraud through the combination of information sourced from financial ratios and managerial comments within corporate annual reports. We employ a hierarchical attention network (HAN) with a long short-term memory (LSTM) encoder to extract text features from the Management Discussion and Analysis (MD&A) section of annual reports. The model is designed to offer two distinct features. First, it reflects the structured hierarchy of documents, which previous models were unable to capture. Second, the model embodies two different attention mechanisms at the word and sentence level, which allows content to be differentiated in terms of its importance in the process of constructing the document representation. As a result of its architecture, the model captures both content and context of managerial comments, which serve as supplementary predictors to financial ratios in the detection of fraudulent reporting. Additionally, the model provides interpretable indicators denoted as “red-flag” sentences, which assist stakeholders in their process of determining whether further investigation of a specific annual report is required. Empirical results demonstrate that textual features of MD&A sections extracted by HAN yield promising classification results and substantially reinforce financial ratios.
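The two-level attention idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The LSTM encoder and the learned projection layers are omitted, and all vectors, the context vectors, the financial ratios, and the function names (`attention_pool`, `encode_document`) are hypothetical toy values chosen only for illustration. The sketch shows how word-level attention builds sentence vectors, how sentence-level attention builds a document vector, how the sentence weights could point to a "red-flag" sentence, and how text features might be concatenated with financial ratios.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Score each vector against a context vector, then take the weighted sum.

    Returns (pooled_vector, weights). In a trained HAN the vectors would be
    LSTM hidden states and the context vector would be learned; both are toy
    values here (assumption).
    """
    scores = [
        sum(math.tanh(v_i) * c_i for v_i, c_i in zip(vector, context))
        for vector in vectors
    ]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def encode_document(document, word_context, sentence_context):
    """document: list of sentences, each a list of word vectors."""
    # Word level: pool the word vectors of each sentence into a sentence vector.
    sentence_vectors = [attention_pool(words, word_context)[0] for words in document]
    # Sentence level: pool the sentence vectors into one document vector.
    return attention_pool(sentence_vectors, sentence_context)


# Toy document: three sentences of 2-dimensional word vectors (hypothetical).
toy_document = [
    [[0.1, 0.0], [0.2, 0.1]],
    [[1.5, 1.2], [1.1, 1.4], [0.9, 1.0]],
    [[0.0, 0.3], [0.1, 0.1]],
]
word_context = [1.0, 1.0]      # hypothetical; learned in a real model
sentence_context = [1.0, 1.0]  # hypothetical; learned in a real model

doc_vector, sentence_weights = encode_document(
    toy_document, word_context, sentence_context
)

# The sentence with the largest attention weight is the "red-flag" candidate.
red_flag_index = max(range(len(sentence_weights)), key=sentence_weights.__getitem__)

# Text features joined with financial ratios (hypothetical values) would then
# feed a downstream classifier, which is not shown here.
financial_ratios = [0.42, 1.7]
combined_features = doc_vector + financial_ratios
```

Because the attention weights at each level sum to one, they can be read as relative importance scores, which is what makes a ranking of sentences possible in this kind of model.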

Suggested Citation

  • Craja, Patricia & Kim, Alisa & Lessmann, Stefan, 2020. "Deep Learning application for fraud detection in financial statements," IRTG 1792 Discussion Papers 2020-007, Humboldt University of Berlin, International Research Training Group 1792 "High Dimensional Nonstationary Time Series".
  • Handle: RePEc:zbw:irtgdp:2020007

    Download full text from publisher

    File URL: https://www.econstor.eu/bitstream/10419/230813/1/irtg1792dp2020-007.pdf
    Download Restriction: no

    Citations

    Cited by:

    1. Dimitrios Kydros & Michail Pazarskis & Athanasia Karakitsiou, 2022. "A framework for identifying the falsified financial statements using network textual analysis: a general model and the Greek example," Annals of Operations Research, Springer, vol. 316(1), pages 513-527, September.
    2. Goodell, John W. & Kumar, Satish & Lim, Weng Marc & Pattnaik, Debidutta, 2021. "Artificial intelligence and machine learning in finance: Identifying foundations, themes, and research clusters from bibliometric analysis," Journal of Behavioral and Experimental Finance, Elsevier, vol. 32(C).
    3. Li, Jing & Li, Nan & Xia, Tongshui & Guo, Jinjin, 2023. "Textual analysis and detection of financial fraud: Evidence from Chinese manufacturing firms," Economic Modelling, Elsevier, vol. 126(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. James P. Ryans, 2021. "Textual classification of SEC comment letters," Review of Accounting Studies, Springer, vol. 26(1), pages 37-80, March.
    2. Ingrid E. Fisher & Margaret R. Garnsey & Mark E. Hughes, 2016. "Natural Language Processing in Accounting, Auditing and Finance: A Synthesis of the Literature with a Roadmap for Future Research," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 23(3), pages 157-214, July.
    3. Frankel, Richard & Jennings, Jared & Lee, Joshua, 2016. "Using unstructured and qualitative disclosures to explain accruals," Journal of Accounting and Economics, Elsevier, vol. 62(2), pages 209-227.
    4. Tim Loughran & Bill McDonald, 2016. "Textual Analysis in Accounting and Finance: A Survey," Journal of Accounting Research, Wiley Blackwell, vol. 54(4), pages 1187-1230, September.
    5. Liu, Pu & Nguyen, Hazel T., 2020. "CEO characteristics and tone at the top inconsistency," Journal of Economics and Business, Elsevier, vol. 108(C).
    6. David F. Larcker & Anastasia A. Zakolyukina, 2012. "Detecting Deceptive Discussions in Conference Calls," Journal of Accounting Research, Wiley Blackwell, vol. 50(2), pages 495-540, May.
    7. Blankespoor, Elizabeth & deHaan, Ed & Marinovic, Iván, 2020. "Disclosure processing costs, investors’ information choice, and equity market outcomes: A review," Journal of Accounting and Economics, Elsevier, vol. 70(2).
    8. Bakarich, Kathleen M. & Hossain, Mahmud & Weintrop, Joseph, 2019. "Different time, different tone: Company life cycle," Journal of Contemporary Accounting and Economics, Elsevier, vol. 15(1), pages 69-86.
    9. Wolfgang Breuer & Andreas Knetsch & Astrid Juliane Salzmann, 2020. "What Does It Mean When Managers Talk About Trust?," Journal of Business Ethics, Springer, vol. 166(3), pages 473-488, October.
    10. Richard Frankel & Jared Jennings & Joshua Lee, 2022. "Disclosure Sentiment: Machine Learning vs. Dictionary Methods," Management Science, INFORMS, vol. 68(7), pages 5514-5532, July.
    11. Blau, Benjamin M. & DeLisle, Jared R. & Price, S. McKay, 2015. "Do sophisticated investors interpret earnings conference call tone differently than investors at large? Evidence from short sales," Journal of Corporate Finance, Elsevier, vol. 31(C), pages 203-219.
    12. Senave, Elseline & Jans, Mieke J. & Srivastava, Rajendra P., 2023. "The application of text mining in accounting," International Journal of Accounting Information Systems, Elsevier, vol. 50(C).
    13. Peter M. Clarkson & Jordan Ponn & Gordon D. Richardson & Frank Rudzicz & Albert Tsang & Jingjing Wang, 2020. "A Textual Analysis of US Corporate Social Responsibility Reports," Abacus, Accounting Foundation, University of Sydney, vol. 56(1), pages 3-34, March.
    14. Kelvin K. F. Law & Lillian F. Mills, 2015. "Taxes and Financial Constraints: Evidence from Linguistic Cues," Journal of Accounting Research, Wiley Blackwell, vol. 53(4), pages 777-819, September.
    15. Yan Luo & Linying Zhou, 2020. "Textual tone in corporate financial disclosures: a survey of the literature," International Journal of Disclosure and Governance, Palgrave Macmillan, vol. 17(2), pages 101-110, September.
    16. Jiao Ji & Oleksandr Talavera & Shuxing Yin, 2018. "The Hidden Information Content: Evidence from the Tone of Independent Director Reports," Working Papers 2018-28, Swansea University, School of Management.
    17. Berkin, Anil & Aerts, Walter & Van Caneghem, Tom, 2023. "Feasibility analysis of machine learning for performance-related attributional statements," International Journal of Accounting Information Systems, Elsevier, vol. 48(C).
    18. Brian J. Bushee & Ian D. Gow & Daniel J. Taylor, 2018. "Linguistic Complexity in Firm Disclosures: Obfuscation or Information?," Journal of Accounting Research, Wiley Blackwell, vol. 56(1), pages 85-121, March.
    19. Kristian D. Allee & Matthew D. DeAngelis, 2015. "The Structure of Voluntary Disclosure Narratives: Evidence from Tone Dispersion," Journal of Accounting Research, Wiley Blackwell, vol. 53(2), pages 241-274, May.
    20. Christina Bannier & Thomas Pauls & Andreas Walter, 2019. "Content analysis of business communication: introducing a German dictionary," Journal of Business Economics, Springer, vol. 89(1), pages 79-123, February.

    More about this item

    Keywords

    fraud detection; financial statements; deep learning; text analytics;

    JEL classification:

    • C00 - Mathematical and Quantitative Methods - - General - - - General
