
DeepGreen: Effective LLM-Driven Greenwashing Monitoring System Designed for Empirical Testing -- Evidence from China

Author

Listed:
  • Congluo Xu
  • Jiuyue Liu
  • Ziyang Li
  • Chengmengjia Lin

Abstract

Motivated by the emerging adoption of Large Language Models (LLMs) in economics and management research, this paper investigates whether LLMs can reliably identify corporate greenwashing narratives and, more importantly, whether and how the greenwashing signals extracted from textual disclosures can be used to empirically identify causal effects. To this end, the paper proposes DeepGreen, a dual-stage LLM-driven system for detecting potential corporate greenwashing in annual reports. Applied to 9,369 A-share annual reports published between 2021 and 2023, DeepGreen attains high reliability in random-sample validation at both stages. An ablation experiment shows that Retrieval-Augmented Generation (RAG) reduces hallucinations compared with simply lengthening the input window. Empirical tests indicate that the greenwashing measure captured by DeepGreen reveals a positive relationship between greenwashing and environmental penalties; instrumental-variable (IV), propensity score matching (PSM), and placebo tests support the robustness and causal interpretation of this evidence. Further analysis suggests that the presence and number of green investors weaken the positive correlation between greenwashing and penalties. Heterogeneity analysis shows that the positive greenwashing-penalty relationship is less significant in large corporations and in corporations that have accumulated green assets, indicating that these green assets may be exploited as a credibility shield for greenwashing. Our findings demonstrate that LLMs can standardize ESG oversight by providing early warnings and directing regulators' scarce attention toward the subsets of corporations where monitoring is more warranted.
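
The abstract mentions a placebo test but does not specify its design. As a rough illustration of the general idea only, the following Python sketch randomly reassigns a greenwashing score across firms and checks how often the reshuffled data yields a slope as large as the actual one. All data here are synthetic, and the names `ols_slope` and `placebo_test` are hypothetical; this is not the authors' implementation.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope from a one-regressor OLS fit of y on x (with intercept)."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var = sum((xi - mean_x) ** 2 for xi in x)
    return cov / var


def placebo_test(x, y, n_draws=500, seed=0):
    """Shuffle x across observations and count how often the placebo
    slope is at least as large in magnitude as the actual slope."""
    rng = random.Random(seed)
    actual = ols_slope(x, y)
    shuffled = list(x)
    exceed = 0
    for _ in range(n_draws):
        rng.shuffle(shuffled)  # break any true link between x and y
        if abs(ols_slope(shuffled, y)) >= abs(actual):
            exceed += 1
    return actual, exceed / n_draws


# Synthetic data (assumption): penalties rise with the greenwashing score.
data_rng = random.Random(42)
greenwashing = [data_rng.random() for _ in range(300)]
penalties = [0.5 * g + data_rng.gauss(0, 0.1) for g in greenwashing]

slope, placebo_share = placebo_test(greenwashing, penalties)
print(f"actual slope: {slope:.3f}")
print(f"share of placebo slopes at least as large: {placebo_share:.3f}")
```

A small share of placebo slopes matching the actual one suggests the estimated relationship is unlikely to be an artifact of chance assignment; it does not by itself establish causality.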

Suggested Citation

  • Congluo Xu & Jiuyue Liu & Ziyang Li & Chengmengjia Lin, 2025. "DeepGreen: Effective LLM-Driven Greenwashing Monitoring System Designed for Empirical Testing -- Evidence from China," Papers 2504.07733, arXiv.org, revised Jan 2026.
  • Handle: RePEc:arx:papers:2504.07733

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2504.07733
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Asay, H. Scott & Libby, Robert & Rennekamp, Kristina, 2018. "Firm performance, reporting goals, and language choices in narrative disclosures," Journal of Accounting and Economics, Elsevier, vol. 65(2), pages 380-398.
    2. Wood, Katherine & Pyun, Chaehyun & Pham, Hieu, 2025. "Beyond Green Labels: Assessing Mutual Funds’ ESG Commitments through Large Language Models," Finance Research Letters, Elsevier, vol. 74(C).
    3. Bart Frijns & Dimitris Margaritis & Maria Psillaki, 2012. "Firm efficiency and stock returns," Journal of Productivity Analysis, Springer, vol. 37(3), pages 295-306, June.
    4. Tran, Duc Hung, 2014. "Multiple corporate governance attributes and the cost of capital – Evidence from Germany," The British Accounting Review, Elsevier, vol. 46(2), pages 179-197.
    5. Wang, Qishu, 2025. "Generative AI-assisted evaluation of ESG practices and information delays in ESG ratings," Finance Research Letters, Elsevier, vol. 74(C).
    6. Tim Loughran & Bill McDonald, 2016. "Textual Analysis in Accounting and Finance: A Survey," Journal of Accounting Research, John Wiley & Sons, Ltd., vol. 54(4), pages 1187-1230, September.
    7. Dyer, Travis & Lang, Mark & Stice-Lawrence, Lorien, 2017. "The evolution of 10-K textual disclosure: Evidence from Latent Dirichlet Allocation," Journal of Accounting and Economics, Elsevier, vol. 64(2), pages 221-245.
    8. Christine A. Botosan & Marlene A. Plumlee, 2002. "A Re‐examination of Disclosure Level and the Expected Cost of Equity Capital," Journal of Accounting Research, John Wiley & Sons, Ltd., vol. 40(1), pages 21-40, March.
    9. Matthew J. Spaniol & Evita Danilova-Jensen & Martin Nielsen & Carl Gyldenkærne Rosdahl & Clara Jasmin Schmidt, 2024. "Defining Greenwashing: A Concept Analysis," Sustainability, MDPI, vol. 16(20), pages 1-17, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hua Sun & Hongkang Xu, 2026. "Corporate social responsibility and data visualization in 10-K filings," Review of Quantitative Finance and Accounting, Springer, vol. 66(1), pages 201-233, January.
    2. Everett, Jeff & Shiraz Rahaman, Abu & Neu, Dean & Saxton, Gregory, 2024. "Letters to the editor, institutional experimentation, and the public accounting professional," Critical Perspectives on Accounting, Elsevier, vol. 99(C).
    3. Chychyla, Roman & Leone, Andrew J. & Minutti-Meza, Miguel, 2019. "Complexity of financial reporting standards and accounting expertise," Journal of Accounting and Economics, Elsevier, vol. 67(1), pages 226-253.
    4. Elsayed, Mohamed & Elshandidy, Tamer, 2021. "Internal control effectiveness, textual risk disclosure, and their usefulness: U.S. evidence," Advances in Accounting, Elsevier, vol. 53(C).
    5. Pastwa, Anna M. & Shrestha, Prabal & Thewissen, James & Torsin, Wouter, 2021. "Unpacking the black box of ICO white papers: a topic modeling approach," LIDAM Discussion Papers LFIN 2021018, Université catholique de Louvain, Louvain Finance (LFIN).
    6. James P. Ryans, 2021. "Textual classification of SEC comment letters," Review of Accounting Studies, Springer, vol. 26(1), pages 37-80, March.
    7. M. J. Histen, 2022. "Taking Information Seriously: A Firm-side Interpretation of Risk Factor Disclosure," International Advances in Economic Research, Springer;International Atlantic Economic Society, vol. 28(3), pages 119-131, November.
    8. Alex Kim & Maximilian Muhn & Valeri Nikolaev, 2023. "Bloated Disclosures: Can ChatGPT Help Investors Process Information?," Papers 2306.10224, arXiv.org, revised Oct 2025.
    9. Berkin, Anil & Aerts, Walter & Van Caneghem, Tom, 2023. "Feasibility analysis of machine learning for performance-related attributional statements," International Journal of Accounting Information Systems, Elsevier, vol. 48(C).
    10. Rjiba, Hatem & Saadi, Samir & Boubaker, Sabri & Ding, Xiaoya (Sara), 2021. "Annual report readability and the cost of equity capital," Journal of Corporate Finance, Elsevier, vol. 67(C).
    11. Liu, Cheng & Dong, Siyuan & Gao, Xinyi, 2026. "Does policy-oriented environmental disclosure increase market uncertainty? Evidence from stock price volatility in China," Research in International Business and Finance, Elsevier, vol. 81(C).
    12. Hong, Eunpyo & Kottimukkalur, Badrinath & Noh, Joonki, 2026. "Uncertain Text and Price Reactions to Earnings Releases," Journal of Banking & Finance, Elsevier, vol. 182(C).
    13. Ibrahim El-Sayed Ebaid, 2023. "IFRS adoption and the readability of corporate annual reports: evidence from an emerging market," Future Business Journal, Springer, vol. 9(1), pages 1-12, December.
    14. Nerissa C. Brown & Richard M. Crowley & W. Brooke Elliott, 2020. "What Are You Saying? Using topic to Detect Financial Misreporting," Journal of Accounting Research, John Wiley & Sons, Ltd., vol. 58(1), pages 237-291, March.
    15. Hyunkwon Cho & Robert Kim, 2021. "Asymmetric effects of voluntary disclosure on stock liquidity: evidence from 8‐K filings," Accounting and Finance, Accounting and Finance Association of Australia and New Zealand, vol. 61(1), pages 803-846, March.
    16. Blankespoor, Elizabeth & deHaan, Ed & Marinovic, Iván, 2020. "Disclosure processing costs, investors’ information choice, and equity market outcomes: A review," Journal of Accounting and Economics, Elsevier, vol. 70(2).
    17. Durnev, Art & Mangen, Claudine, 2020. "The spillover effects of MD&A disclosures for real investment: The role of industry competition," Journal of Accounting and Economics, Elsevier, vol. 70(1).
    18. Pradhan, Rudra P. & Samarakoon, S.M.R.K. & Wijesinghe, B.A.C.H. & Maradana, Rana P., 2025. "Clear or confusing? How financial report readability and tone are associated with dividend payouts in Indian corporations," The Quarterly Review of Economics and Finance, Elsevier, vol. 104(C).
    19. Liu, Qigui & Wang, Junyi & Chi, Wenqiang, 2022. "The spillover effects of innovation content disclosure in MD&A," Pacific-Basin Finance Journal, Elsevier, vol. 76(C).
    20. Neu, Dean & Saxton, Greg & Rahaman, Abu & Everett, Jeffery, 2019. "Twitter and social accountability: Reactions to the Panama Papers," Critical Perspectives on Accounting, Elsevier, vol. 61(C), pages 38-53.
