Printed from https://ideas.repec.org/p/cpr/ceprdp/17796.html

Refining Public Policies with Machine Learning: The Case of Tax Auditing

Author

Listed:
  • Battaglini, Marco
  • Guiso, Luigi
  • Lacava, Chiara
  • Miller, Douglas L.
  • Patacchini, Eleonora

Abstract

We study how ML techniques can be used to improve tax auditing efficiency using administrative data without the need for randomized audits. Using Italy’s population data on sole proprietorship tax returns and audits, our new approach addresses the challenge that predictions must be trained on human-selected data. There are substantial margins for raising revenue from audits by improving the selection of taxpayers to audit with ML. Replacing the 10% least promising audits with an equal number selected by our algorithm raises detected tax evasion by as much as 38%, and evasion that is actually paid back by 29%.
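
The replacement exercise described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the `Taxpayer` record, the `predicted_evasion` score, the function names, and the synthetic numbers are all assumptions made for illustration. In particular, the sketch takes the model scores as given and does not address the problem the paper focuses on, namely that outcomes are observed only for taxpayers whom human auditors chose to audit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taxpayer:
    taxpayer_id: str
    predicted_evasion: int  # hypothetical model score, in euros
    audited: bool           # True if selected by human auditors


def reallocate_audits(taxpayers, share=0.10):
    """Return (dropped, added): the lowest-scored `share` of audited
    taxpayers and an equal number of top-scored unaudited taxpayers."""
    audited = sorted(
        (t for t in taxpayers if t.audited),
        key=lambda t: t.predicted_evasion,
    )
    pool = sorted(
        (t for t in taxpayers if not t.audited),
        key=lambda t: t.predicted_evasion,
        reverse=True,
    )
    # Keep the total number of audits fixed: swap k for k.
    k = min(int(len(audited) * share), len(pool))
    return audited[:k], pool[:k]


def predicted_gain(taxpayers, share=0.10):
    """Predicted relative change in detected evasion from the swap,
    measured against the predicted total for the original audits."""
    dropped, added = reallocate_audits(taxpayers, share)
    baseline = sum(t.predicted_evasion for t in taxpayers if t.audited)
    if baseline == 0:
        return 0.0
    change = (sum(t.predicted_evasion for t in added)
              - sum(t.predicted_evasion for t in dropped))
    return change / baseline


# Synthetic example data (invented, not from the paper).
audited = [Taxpayer(f"A{i}", 100 * i, True) for i in range(1, 11)]
unaudited = [
    Taxpayer("U1", 50, False),
    Taxpayer("U2", 900, False),
    Taxpayer("U3", 300, False),
]
sample = audited + unaudited
```

Calling `predicted_gain(sample)` swaps the single lowest-scored audit for the highest-scored unaudited taxpayer and reports the change as a fraction of the baseline. The gain here is a prediction based on scores, so it is only as reliable as the model that produced them.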

Suggested Citation

  • Battaglini, Marco & Guiso, Luigi & Lacava, Chiara & Miller, Douglas L. & Patacchini, Eleonora, 2023. "Refining Public Policies with Machine Learning: The Case of Tax Auditing," CEPR Discussion Papers 17796, C.E.P.R. Discussion Papers.
  • Handle: RePEc:cpr:ceprdp:17796

    Download full text from publisher

    File URL: https://cepr.org/publications/DP17796
    Download Restriction: CEPR Discussion Papers are free to download for our researchers, subscribers and members. If you fall into one of these categories but have trouble downloading our papers, please contact us at subscribers@cepr.org

    References listed on IDEAS

    1. Marco Battaglini & Luigi Guiso & Chiara Lacava & Eleonora Patacchini, 2019. "Tax Professionals: Tax-Evasion Facilitators or Information Hubs?," NBER Working Papers 25745, National Bureau of Economic Research, Inc.
    2. Monica P Bhatt & Sara B Heller & Max Kapustin & Marianne Bertrand & Christopher Blattman, 2024. "Predicting and Preventing Gun Violence: An Experimental Evaluation of READI Chicago," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 139(1), pages 1-56.
    3. Christopher R. Knittel & Samuel Stolper, 2021. "Machine Learning about Treatment Effect Heterogeneity: The Case of Household Energy Use," AEA Papers and Proceedings, American Economic Association, vol. 111, pages 440-444, May.
    4. William C Boning & Nathaniel Hendren & Ben Sprung-Keyser & Ellen Stuart, 2025. "A Welfare Analysis of Tax Audits Across the Income Distribution," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 140(1), pages 63-112.
    5. Jon Kleinberg & Himabindu Lakkaraju & Jure Leskovec & Jens Ludwig & Sendhil Mullainathan, 2018. "Human Decisions and Machine Predictions," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 133(1), pages 237-293.
    6. Jongbin Jung & Connor Concannon & Ravi Shroff & Sharad Goel & Daniel G. Goldstein, 2020. "Simple rules to guide expert classifications," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(3), pages 771-800, June.
    7. M. Hino & E. Benami & N. Brooks, 2018. "Machine learning for environmental monitoring," Nature Sustainability, Nature, vol. 1(10), pages 583-588, October.

    Citations

    Citations are extracted by the CitEc Project.

    Cited by:

    1. Lange, Thomas & Melsom, Anne May, 2024. "Tax Compliance among Managers: Evidence from Randomized Audits," Nordic Tax Journal, Sciendo, vol. 2024(1), pages 1-29.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Shroff, Ravi & Vamvourellis, Konstantinos, 2022. "Pretrial release judgments and decision fatigue," LSE Research Online Documents on Economics 117579, London School of Economics and Political Science, LSE Library.
    2. Demetrio Guzzardi & Salvatore Morelli, 2024. "A New Geography of Inequality: Top incomes in Italian Regions and Inner Areas," LEM Papers Series 2024/16, Laboratory of Economics and Management (LEM), Sant'Anna School of Advanced Studies, Pisa, Italy.
    3. Juan Carlos Perdomo, 2023. "The Relative Value of Prediction in Algorithmic Decision Making," Papers 2312.08511, arXiv.org, revised May 2024.
    4. Emilio Carrizosa & Cristina Molero-Río & Dolores Romero Morales, 2021. "Mathematical optimization in classification and regression trees," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(1), pages 5-33, April.
    5. Kristian Lum & David B. Dunson & James Johndrow, 2022. "Closer than they appear: A Bayesian perspective on individual‐level heterogeneity in risk assessment," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(2), pages 588-614, April.
    6. Sophie-Charlotte Klose & Johannes Lederer, 2020. "A Pipeline for Variable Selection and False Discovery Rate Control With an Application in Labor Economics," Papers 2006.12296, arXiv.org, revised Jun 2020.
    7. Aliprantis, Dionissi & Martin, Hal & Tauber, Kristen, 2024. "What determines the success of housing mobility programs?," Journal of Housing Economics, Elsevier, vol. 65(C).
    8. Daníelsson, Jón & Macrae, Robert & Uthemann, Andreas, 2022. "Artificial intelligence and systemic risk," Journal of Banking & Finance, Elsevier, vol. 140(C).
    9. Yucheng Yang & Zhong Zheng & Weinan E, 2020. "Interpretable Neural Networks for Panel Data Analysis in Economics," Papers 2010.05311, arXiv.org, revised Nov 2020.
    10. Daniel Carter & Amelia Acker & Dan Sholler, 2021. "Investigative approaches to researching information technology companies," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 72(6), pages 655-666, June.
    11. Zhao, Shuping & Xu, Kai & Wang, Zhao & Liang, Changyong & Lu, Wenxing & Chen, Bo, 2022. "Financial distress prediction by combining sentiment tone features," Economic Modelling, Elsevier, vol. 106(C).
    12. Garbero, Alessandra & Sakos, Grayson & Cerulli, Giovanni, 2023. "Towards data-driven project design: Providing optimal treatment rules for development projects," Socio-Economic Planning Sciences, Elsevier, vol. 89(C).
    13. Maude Lavanchy & Patrick Reichert & Jayanth Narayanan & Krishna Savani, 2023. "Applicants’ Fairness Perceptions of Algorithm-Driven Hiring Procedures," Journal of Business Ethics, Springer, vol. 188(1), pages 125-150, November.
    14. Ivan A Canay & Magne Mogstad & Jack Mount, 2024. "On the Use of Outcome Tests for Detecting Bias in Decision Making," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 91(4), pages 2135-2167.
    15. Bilicka, Katarzyna & Scur, Daniela, 2024. "Organizational capacity and profit shifting," Journal of Public Economics, Elsevier, vol. 238(C).
    16. Md Mohsan Khudri & Kang Keun Rhee & Mohammad Shabbir Hasan & Karar Zunaid Ahsan, 2023. "Predicting nutritional status for women of childbearing age from their economic, health, and demographic features: A supervised machine learning approach," PLOS ONE, Public Library of Science, vol. 18(5), pages 1-31, May.
    17. Frederico M. Bublitz & Arlene Oetomo & Kirti S. Sahu & Amethyst Kuang & Laura X. Fadrique & Pedro E. Velmovitsky & Raphael M. Nobrega & Plinio P. Morita, 2019. "Disruptive Technologies for Environment and Health Research: An Overview of Artificial Intelligence, Blockchain, and Internet of Things," IJERPH, MDPI, vol. 16(20), pages 1-24, October.
    18. Ratzanyel Rincón, 2023. "Quarterly multidimensional poverty estimates in Mexico using machine learning algorithms/Estimaciones trimestrales de pobreza multidimensional en México mediante algoritmos de aprendizaje de máquina," Estudios Económicos, El Colegio de México, Centro de Estudios Económicos, vol. 38(1), pages 3-68.
    19. Klockmann, Victor & von Schenk, Alicia & Villeval, Marie Claire, 2022. "Artificial intelligence, ethics, and intergenerational responsibility," Journal of Economic Behavior & Organization, Elsevier, vol. 203(C), pages 284-317.
    20. Ostheimer, Julia & Chowdhury, Soumitra & Iqbal, Sarfraz, 2021. "An alliance of humans and machines for machine learning: Hybrid intelligent systems and their design principles," Technology in Society, Elsevier, vol. 66(C).

    More about this item

    JEL classification:

    • C55: Mathematical and Quantitative Methods > Econometric Modeling > Large Data Sets: Modeling and Analysis
    • H26: Public Economics > Taxation, Subsidies, and Revenue > Tax Evasion and Avoidance
