Printed from https://ideas.repec.org/p/arx/papers/2104.11783.html

Form 10-Q Itemization

Author

Listed:
  • Yanci Zhang
  • Tianming Du
  • Yujie Sun
  • Lawrence Donohue
  • Rui Dai

Abstract

The quarterly financial statement, or Form 10-Q, is one of the most frequently required filings for US public companies to disclose financial and other important business information. Due to the massive volume of 10-Q filings and the enormous variations in the reporting format, it has been a long-standing challenge to retrieve item-specific information from 10-Q filings that lack machine-readable hierarchy. This paper presents a solution for itemizing 10-Q files by complementing a rule-based algorithm with a Convolutional Neural Network (CNN) image classifier. This solution demonstrates a pipeline that can be generalized for rapid data retrieval across large volumes of textual data using only typographic items. The extracted textual data can be used as unlabeled content-specific data to train transformer models (e.g., BERT) or in various field-focused natural language processing (NLP) applications.
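The rule-based half of the pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose rules are not given on this page; it only assumes that item headings appear on their own lines in the standard Form 10-Q layout ("Part I" or "Part II", followed by "Item 1", "Item 1A", and so on). The names `PART_RE`, `ITEM_RE`, and `itemize` are hypothetical, and the CNN image classifier is not covered.

```python
import re

# Assumption: headings start a line and follow the standard 10-Q layout.
# "Part I" / "Part II" (a third "I" is rejected by the word boundary).
PART_RE = re.compile(r"^\s*PART\s+(I{1,2})\b", re.IGNORECASE)
# "Item 1", "Item 1A", "Item 2." etc.; the heading title on the same line is ignored.
ITEM_RE = re.compile(r"^\s*ITEM\s+(\d{1,2}[A-Z]?)\s*[.:\-]?", re.IGNORECASE)


def itemize(text):
    """Split filing text into {(part, item): body} using heading lines only."""
    sections = {}
    part = None  # current part, "I" or "II"
    key = None   # current (part, item) pair, or None before the first item
    for line in text.splitlines():
        part_match = PART_RE.match(line)
        if part_match:
            part = part_match.group(1).upper()
            key = None  # a new part closes the previous item
            continue
        item_match = ITEM_RE.match(line)
        if item_match and part is not None:
            key = (part, item_match.group(1).upper())
            sections.setdefault(key, [])
            continue
        # Any other non-empty line belongs to the item currently open.
        if key is not None and line.strip():
            sections[key].append(line.strip())
    return {k: " ".join(v) for k, v in sections.items()}
```

A filing whose headings are rendered as images or split across lines would defeat these patterns, which is the kind of case the abstract says the rule-based step is complemented by a classifier.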

Suggested Citation

  • Yanci Zhang & Tianming Du & Yujie Sun & Lawrence Donohue & Rui Dai, 2021. "Form 10-Q Itemization," Papers 2104.11783, arXiv.org, revised Oct 2021.
  • Handle: RePEc:arx:papers:2104.11783

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2104.11783
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Dyer, Travis & Lang, Mark & Stice-Lawrence, Lorien, 2017. "The evolution of 10-K textual disclosure: Evidence from Latent Dirichlet Allocation," Journal of Accounting and Economics, Elsevier, vol. 64(2), pages 221-245.
    2. Tim Loughran & Bill McDonald, 2011. "When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10‐Ks," Journal of Finance, American Finance Association, vol. 66(1), pages 35-65, February.

    Citations

    Cited by:

    1. Yanci Zhang & Yutong Lu & Haitao Mao & Jiawei Huang & Cien Zhang & Xinyi Li & Rui Dai, 2023. "Company Competition Graph," Papers 2304.00323, arXiv.org.
    2. Yanci Zhang & Mengjia Xia & Mingyang Li & Haitao Mao & Yutong Lu & Yupeng Lan & Jinlin Ye & Rui Dai, 2023. "Form 10-K Itemization," Papers 2303.04688, arXiv.org.
    3. Liao Zhu & Haoxuan Wu & Martin T. Wells, 2021. "A News-based Machine Learning Model for Adaptive Asset Pricing," Papers 2106.07103, arXiv.org.
    4. Liao Zhu, 2021. "The Adaptive Multi-Factor Model and the Financial Market," Papers 2107.14410, arXiv.org, revised Aug 2021.
    5. Liao Zhu & Ningning Sun & Martin T. Wells, 2022. "Clustering Structure of Microstructure Measures," Applied Economics and Finance, Redfame publishing, vol. 9(1), pages 85-95, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Pastwa, Anna M. & Shrestha, Prabal & Thewissen, James & Torsin, Wouter, 2021. "Unpacking the black box of ICO white papers: a topic modeling approach," LIDAM Discussion Papers LFIN 2021018, Université catholique de Louvain, Louvain Finance (LFIN).
    2. Allen H. Huang & Jianghua Shen & Amy Y. Zang, 2022. "The unintended benefit of the risk factor mandate of 2005," Review of Accounting Studies, Springer, vol. 27(4), pages 1319-1355, December.
    3. James P. Ryans, 2021. "Textual classification of SEC comment letters," Review of Accounting Studies, Springer, vol. 26(1), pages 37-80, March.
    4. Alex Kim & Maximilian Muhn & Valeri Nikolaev, 2023. "Bloated Disclosures: Can ChatGPT Help Investors Process Information?," Papers 2306.10224, arXiv.org, revised Feb 2024.
    5. Berkin, Anil & Aerts, Walter & Van Caneghem, Tom, 2023. "Feasibility analysis of machine learning for performance-related attributional statements," International Journal of Accounting Information Systems, Elsevier, vol. 48(C).
    6. Rjiba, Hatem & Saadi, Samir & Boubaker, Sabri & Ding, Xiaoya (Sara), 2021. "Annual report readability and the cost of equity capital," Journal of Corporate Finance, Elsevier, vol. 67(C).
    7. Travis Dyer & Eunjee Kim, 2021. "Anonymous Equity Research," Journal of Accounting Research, Wiley Blackwell, vol. 59(2), pages 575-611, May.
    8. Nerissa C. Brown & Richard M. Crowley & W. Brooke Elliott, 2020. "What Are You Saying? Using topic to Detect Financial Misreporting," Journal of Accounting Research, Wiley Blackwell, vol. 58(1), pages 237-291, March.
    9. Blankespoor, Elizabeth & deHaan, Ed & Marinovic, Iván, 2020. "Disclosure processing costs, investors’ information choice, and equity market outcomes: A review," Journal of Accounting and Economics, Elsevier, vol. 70(2).
    10. Durnev, Art & Mangen, Claudine, 2020. "The spillover effects of MD&A disclosures for real investment: The role of industry competition," Journal of Accounting and Economics, Elsevier, vol. 70(1).
    11. Liu, Qigui & Wang, Junyi & Chi, Wenqiang, 2022. "The spillover effects of innovation content disclosure in MD&A," Pacific-Basin Finance Journal, Elsevier, vol. 76(C).
    12. Neu, Dean & Saxton, Greg & Rahaman, Abu & Everett, Jeffery, 2019. "Twitter and social accountability: Reactions to the Panama Papers," Critical Perspectives on Accounting, Elsevier, vol. 61(C), pages 38-53.
    13. Wei, Lu & Jing, Haozhe & Huang, Jie & Deng, Yuqi & Jing, Zhongbo, 2023. "Do textual risk disclosures reveal corporate risk? Evidence from U.S. fintech corporations," Economic Modelling, Elsevier, vol. 127(C).
    14. Soliman, Marwa & Ben-Amar, Walid, 2022. "Corporate social responsibility orientation and textual features of financial disclosures," International Review of Financial Analysis, Elsevier, vol. 84(C).
    15. Eddy Cardinaels & Stephan Hollander & Brian J. White, 2019. "Automatic summarization of earnings releases: attributes and effects on investors’ judgments," Review of Accounting Studies, Springer, vol. 24(3), pages 860-890, September.
    16. Farrell, Michael & Murphy, Dermot & Painter, Marcus & Zhang, Guangli, 2023. "The complexity yield puzzle: A textual analysis of municipal bond disclosures," Working Papers 338, The University of Chicago Booth School of Business, George J. Stigler Center for the Study of the Economy and the State.
    17. Li, Ken, 2022. "Textual fundamentals in earnings press releases," Advances in Accounting, Elsevier, vol. 57(C).
    18. Li, Jing & Li, Nan & Xia, Tongshui & Guo, Jinjin, 2023. "Textual analysis and detection of financial fraud: Evidence from Chinese manufacturing firms," Economic Modelling, Elsevier, vol. 126(C).
    19. Paul Hribar & Richard Mergenthaler & Aaron Roeschley & Spencer Young & Chris X. Zhao, 2022. "Do Managers Issue More Voluntary Disclosure When GAAP Limits Their Reporting Discretion in Financial Statements?," Journal of Accounting Research, Wiley Blackwell, vol. 60(1), pages 299-351, March.
    20. Henry, Elaine & Thewissen, James & Torsin, Wouter, 2021. "International Earnings Announcements: Tone, Forward-looking Statements, and Informativeness," LIDAM Discussion Papers LFIN 2021016, Université catholique de Louvain, Louvain Finance (LFIN).
