
Refining Public Policies with Machine Learning: The Case of Tax Auditing

Author

Listed:
  • Marco Battaglini
  • Luigi Guiso
  • Chiara Lacava
  • Douglas L. Miller
  • Eleonora Patacchini

Abstract

We study the extent to which ML techniques can be used to improve tax auditing efficiency using administrative data, without the need for randomized audits. Using Italy's population data on sole proprietorship tax returns, audits, and their outcomes, we develop a new approach to address the so-called selective labels problem: the fact that an ML algorithm must necessarily be trained on endogenously selected data. We document the existence of substantial margins for raising revenue from audits by improving the selection of taxpayers to audit with ML. Replacing the 10% least productive audits with an equal number of taxpayers selected by our trained algorithm raises detected tax evasion by as much as 38%, and evasion that is actually paid back by 29%.

Suggested Citation

  • Marco Battaglini & Luigi Guiso & Chiara Lacava & Douglas L. Miller & Eleonora Patacchini, 2022. "Refining Public Policies with Machine Learning: The Case of Tax Auditing," NBER Working Papers 30777, National Bureau of Economic Research, Inc.
  • Handle: RePEc:nbr:nberwo:30777
    Note: PE

    Download full text from publisher

    File URL: http://www.nber.org/papers/w30777.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Marco Battaglini & Luigi Guiso & Chiara Lacava & Eleonora Patacchini, 2019. "Tax Professionals: Tax-Evasion Facilitators or Information Hubs?," NBER Working Papers 25745, National Bureau of Economic Research, Inc.
    2. Christopher R. Knittel & Samuel Stolper, 2021. "Machine Learning about Treatment Effect Heterogeneity: The Case of Household Energy Use," AEA Papers and Proceedings, American Economic Association, vol. 111, pages 440-444, May.
    3. William C Boning & Nathaniel Hendren & Ben Sprung-Keyser & Ellen Stuart, 2025. "A Welfare Analysis of Tax Audits Across the Income Distribution," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 140(1), pages 63-112.
    4. Jongbin Jung & Connor Concannon & Ravi Shroff & Sharad Goel & Daniel G. Goldstein, 2020. "Simple rules to guide expert classifications," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(3), pages 771-800, June.
    5. M. Hino & E. Benami & N. Brooks, 2018. "Machine learning for environmental monitoring," Nature Sustainability, Nature, vol. 1(10), pages 583-588, October.
    6. Monica P Bhatt & Sara B Heller & Max Kapustin & Marianne Bertrand & Christopher Blattman, 2024. "Predicting and Preventing Gun Violence: An Experimental Evaluation of READI Chicago," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 139(1), pages 1-56.
    7. Jon Kleinberg & Himabindu Lakkaraju & Jure Leskovec & Jens Ludwig & Sendhil Mullainathan, 2018. "Human Decisions and Machine Predictions," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 133(1), pages 237-293.

    Citations

    Cited by:

    1. Lange, Thomas & Melsom, Anne May, 2024. "Tax Compliance among Managers: Evidence from Randomized Audits," Nordic Tax Journal, Sciendo, vol. 2024(1), pages 1-29.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Emilio Carrizosa & Cristina Molero-Río & Dolores Romero Morales, 2021. "Mathematical optimization in classification and regression trees," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(1), pages 5-33, April.
    2. Shroff, Ravi & Vamvourellis, Konstantinos, 2022. "Pretrial release judgments and decision fatigue," LSE Research Online Documents on Economics 117579, London School of Economics and Political Science, LSE Library.
    3. Demetrio Guzzardi & Salvatore Morelli, 2024. "A New Geography of Inequality: Top incomes in Italian Regions and Inner Areas," LEM Papers Series 2024/16, Laboratory of Economics and Management (LEM), Sant'Anna School of Advanced Studies, Pisa, Italy.
    4. Juan Carlos Perdomo, 2023. "The Relative Value of Prediction in Algorithmic Decision Making," Papers 2312.08511, arXiv.org, revised May 2024.
    5. Kristian Lum & David B. Dunson & James Johndrow, 2022. "Closer than they appear: A Bayesian perspective on individual‐level heterogeneity in risk assessment," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(2), pages 588-614, April.
    6. Aliprantis, Dionissi & Martin, Hal & Tauber, Kristen, 2024. "What determines the success of housing mobility programs?," Journal of Housing Economics, Elsevier, vol. 65(C).
    7. Daníelsson, Jón & Macrae, Robert & Uthemann, Andreas, 2022. "Artificial intelligence and systemic risk," Journal of Banking & Finance, Elsevier, vol. 140(C).
    8. Yucheng Yang & Zhong Zheng & Weinan E, 2020. "Interpretable Neural Networks for Panel Data Analysis in Economics," Papers 2010.05311, arXiv.org, revised Nov 2020.
    9. Daniel Carter & Amelia Acker & Dan Sholler, 2021. "Investigative approaches to researching information technology companies," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 72(6), pages 655-666, June.
    10. Ivan A Canay & Magne Mogstad & Jack Mount, 2024. "On the Use of Outcome Tests for Detecting Bias in Decision Making," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 91(4), pages 2135-2167.
    11. Md Mohsan Khudri & Kang Keun Rhee & Mohammad Shabbir Hasan & Karar Zunaid Ahsan, 2023. "Predicting nutritional status for women of childbearing age from their economic, health, and demographic features: A supervised machine learning approach," PLOS ONE, Public Library of Science, vol. 18(5), pages 1-31, May.
    12. Ratzanyel Rincón, 2023. "Quarterly multidimensional poverty estimates in Mexico using machine learning algorithms/Estimaciones trimestrales de pobreza multidimensional en México mediante algoritmos de aprendizaje de máquina," Estudios Económicos, El Colegio de México, Centro de Estudios Económicos, vol. 38(1), pages 3-68.
    13. Klockmann, Victor & von Schenk, Alicia & Villeval, Marie Claire, 2022. "Artificial intelligence, ethics, and intergenerational responsibility," Journal of Economic Behavior & Organization, Elsevier, vol. 203(C), pages 284-317.
    14. Shan Huang & Michael Allan Ribers & Hannes Ullrich, 2021. "The Value of Data for Prediction Policy Problems: Evidence from Antibiotic Prescribing," Discussion Papers of DIW Berlin 1939, DIW Berlin, German Institute for Economic Research.
    15. Wang, Weilong & Wang, Jianlong & Wu, Haitao, 2024. "The impact of energy-consuming rights trading on green total factor productivity in the context of digital economy: Evidence from listed firms in China," Energy Economics, Elsevier, vol. 131(C).
    16. Columbus, Simon & Feld, Lars P. & Kasper, Matthias & Rablen, Matthew D., 2023. "Behavioural Responses to Unfair Institutions: Experimental Evidence on Rule Compliance, Norm Polarisation, and Trust," IZA Discussion Papers 16346, Institute of Labor Economics (IZA).
    17. Daniel Bjorkegren & Joshua E. Blumenstock & Samsun Knight, 2020. "Manipulation-Proof Machine Learning," Papers 2004.03865, arXiv.org.
    18. Anthony Niblett, 2018. "Regulatory Reform in Ontario: Machine Learning and Regulation," C.D. Howe Institute Commentary, C.D. Howe Institute, issue 507, March.
    19. Ekaterina Jussupow & Kai Spohrer & Armin Heinzl & Joshua Gawlitza, 2021. "Augmenting Medical Diagnosis Decisions? An Investigation into Physicians’ Decision-Making Process with Artificial Intelligence," Information Systems Research, INFORMS, vol. 32(3), pages 713-735, September.
    20. Delogu, Marco & Lagravinese, Raffaele & Paolini, Dimitri & Resce, Giuliano, 2024. "Predicting dropout from higher education: Evidence from Italy," Economic Modelling, Elsevier, vol. 130(C).

    More about this item

    JEL classification:

    • H2 - Public Economics - - Taxation, Subsidies, and Revenue
    • H20 - Public Economics - - Taxation, Subsidies, and Revenue - - - General
    • H26 - Public Economics - - Taxation, Subsidies, and Revenue - - - Tax Evasion and Avoidance
