
Measuring Corruption from Text Data

Author

Listed:
  • Muço, Arieda

Abstract

Using Brazilian municipal audit reports, I construct an automated corruption index that combines a dictionary of audit irregularities with principal component analysis. The index validates strongly against independent human coders, explaining 71–73% of the variation in hand-coded corruption counts in samples where coders themselves exhibit high agreement, and the results are robust within these validation samples. The index behaves as theory predicts, correlating with municipal characteristics that prior research links to corruption. Supervised learning alternatives yield nearly identical municipal rankings (R² = 0.98), confirming that the dictionary approach captures the same underlying construct. The method scales to the full audit corpus and offers advantages over both manual coding and Large Language Models (LLMs) in transparency, cost, and long-run replicability.
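
As a rough illustration of the general idea named in the abstract (dictionary counts combined through a first principal component), the sketch below may help. It is not the paper's implementation: the term list, the category names, the English-language example reports, and all function names are hypothetical, and the first component is found here by simple power iteration rather than by whatever routine the author used.

```python
import math
import re

# Hypothetical dictionary of irregularity terms (illustrative only; the
# paper's actual dictionary is not reproduced on this page).
IRREGULARITY_TERMS = {
    "fraud": ["fraud", "forged"],
    "overbilling": ["overbilling", "overpriced"],
    "procurement": ["irregular bidding", "no tender"],
}


def count_categories(text):
    """Count whole-word matches of each category's terms in one report."""
    lowered = text.lower()
    return [
        sum(len(re.findall(r"\b" + re.escape(term) + r"\b", lowered))
            for term in terms)
        for terms in IRREGULARITY_TERMS.values()
    ]


def standardize(rows):
    """Z-score each column; a constant column becomes all zeros."""
    n = len(rows)
    scaled_columns = []
    for column in zip(*rows):
        mean = sum(column) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
        scaled_columns.append(
            [(x - mean) / sd if sd > 0 else 0.0 for x in column])
    return [list(row) for row in zip(*scaled_columns)]


def first_component(z, iterations=200):
    """Approximate the leading eigenvector of the correlation matrix."""
    n, k = len(z), len(z[0])
    corr = [[sum(z[i][a] * z[i][b] for i in range(n)) / n
             for b in range(k)] for a in range(k)]
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    # Sign convention (an assumption): orient the loadings so that more
    # dictionary hits yield a higher index.
    if sum(v) < 0:
        v = [-x for x in v]
    return v


def corruption_index(reports):
    """Project standardized category counts onto the first component."""
    z = standardize([count_categories(text) for text in reports])
    loadings = first_component(z)
    return [sum(zi * li for zi, li in zip(row, loadings)) for row in z]


# Made-up example reports, ordered from cleanest to most irregular.
reports = [
    "The audit found no problems.",
    "Evidence of fraud and overbilling in one contract.",
    "Fraud, forged receipts, overbilling, overpriced goods and "
    "irregular bidding with no tender.",
]
index = corruption_index(reports)
```

With these three made-up reports the index rises from the first to the third, matching the number of dictionary hits. A validation step like the one the abstract describes would then compare such an index against hand-coded counts, which this sketch does not attempt.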

Suggested Citation

  • Muço, Arieda, 2025. "Measuring Corruption from Text Data," SocArXiv cftvk_v1, Center for Open Science.
  • Handle: RePEc:osf:socarx:cftvk_v1
    DOI: 10.31219/osf.io/cftvk_v1

    Download full text from publisher

    File URL: https://osf.io/download/693c2d5d9a7485bff0baf393/
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gans-Morse, Jordan & Borges, Mariana & Makarin, Alexey & Mannah-Blankson, Theresa & Nickow, Andre & Zhang, Dong, 2018. "Reducing bureaucratic corruption: Interdisciplinary perspectives on what works," World Development, Elsevier, vol. 105(C), pages 171-188.
    2. Raveh, Ohad & Tsur, Yacov, 2023. "Can resource windfalls reduce corruption? The role of term limits," Journal of Environmental Economics and Management, Elsevier, vol. 122(C).
    3. Britto, Diogo G.C. & Fiorin, Stefano, 2020. "Corruption and legislature size: Evidence from Brazil," European Journal of Political Economy, Elsevier, vol. 65(C).
    4. Kendall D. Funk & Erica Owen, 2020. "Consequences of an Anti‐Corruption Experiment for Local Government Performance in Brazil," Journal of Policy Analysis and Management, John Wiley & Sons, Ltd., vol. 39(2), pages 444-468, March.
    5. Rajeev K. Goel & Michael A. Nelson, 2021. "Corrupt encounters of the fairer sex: female entrepreneurs and their corruption perceptions/experience," The Journal of Technology Transfer, Springer, vol. 46(6), pages 1973-1994, December.
    6. Hsien-Yi Chen & Sheng-Syan Chen, 2023. "Can credit default swaps exert an enduring monitoring influence on political integrity?," Review of Quantitative Finance and Accounting, Springer, vol. 60(2), pages 445-469, February.
    7. Joshua D. Ammons & Shishir Shakya, 2024. "Revolutions and corruption," Public Choice, Springer, vol. 201(1), pages 355-376, October.
    8. Krisztina Kis-Katos & Günther G. Schulze, 2013. "Corruption in Southeast Asia: a survey of recent research," Asian-Pacific Economic Literature, The Crawford School, The Australian National University, vol. 27(1), pages 79-109, May.
    9. Ilaria De Angelis & Guido de Blasio & Lucia Rizzica, 2018. "On the unintended effects of public transfers: evidence from EU funding to Southern Italy," Temi di discussione (Economic working papers) 1180, Bank of Italy, Economic Research and International Relations Area.
    10. Jeong, Dahyeon & Shenoy, Ajay & Zimmermann, Laura V., 2023. "De Jure versus De Facto transparency: Corruption in local public office in India," Journal of Public Economics, Elsevier, vol. 221(C).
    11. Afridi, Farzana & Dhillon, Amrita & Chaudhuri, Arka Roy & Kaur, Dashleen, 2020. "Efficacy of Top down audits and Community Monitoring," OSF Preprints akpdy, Center for Open Science.
    12. Campante, Filipe & Du, Rui & Sun, Weizeng & Wang, Jianghao & Zheng, Siqi, 2025. "JUE insight: Political geography and the spatial allocation of economic activity: Evidence from China’s anti-corruption campaign," Journal of Urban Economics, Elsevier, vol. 149(C).
    13. Ilaria Angelis & Guido Blasio & Lucia Rizzica, 2020. "Lost in Corruption. Evidence from EU Funding to Southern Italy," Italian Economic Journal: A Continuation of Rivista Italiana degli Economisti and Giornale degli Economisti, Springer;Società Italiana degli Economisti (Italian Economic Association), vol. 6(2), pages 355-377, July.
    14. Cisneros, Elías & Kis-Katos, Krisztina, 2024. "Unintended environmental consequences of anti-corruption strategies," Journal of Environmental Economics and Management, Elsevier, vol. 128(C).
    15. Joël Cariolle & Petros G Sekeris, 2021. "How export shocks corrupt: theory and evidence," Working Papers hal-03164648, HAL.
    16. Maximiliano Lauletta & Martín A. Rossi & Christian A. Ruzzier, 2022. "Audits and Government Hiring Practices," Economica, London School of Economics and Political Science, vol. 89(353), pages 214-227, January.
    17. Melki, Mickael & Pickering, Andrew, 2020. "Polarization and corruption in America," European Economic Review, Elsevier, vol. 124(C).
    18. Lucas Argentieri Mariani & Mattia Longhi & Silvia Marchesi, 2025. "Reversing the Political Resource Curse: Accountability and Regional Favoritism under Capital Windfalls," Working Papers 552, University of Milano-Bicocca, Department of Economics.
    19. Hélène Laurent, 2021. "Corruption and politicians’ horizon," Economics of Governance, Springer, vol. 22(1), pages 65-91, March.
    20. Guglielmo Barone & Laura Conti & Gaia Narciso & Marco Tonello, 2020. "Auditors conflict of interest: does random selection work?," Trinity Economics Papers tep0820, Trinity College Dublin, Department of Economics.
