
Optimizing Tax Administration Policies with Machine Learning

Authors:
  • Pietro Battiston
  • Simona Gamba
  • Alessandro Santoro

Abstract

Tax authorities around the world are increasingly employing data mining and machine learning algorithms to predict individual behaviours. Although the traditional literature on optimal tax administration provides useful tools for ex-post evaluation of policies, it disregards the problem of which taxpayers to target. This study identifies and characterises a loss function that assigns a social cost to any prediction-based policy. We define this measure as the difference between the social welfare of a given policy and that of an ideal policy unaffected by prediction errors. We show how this loss function is related to the receiver operating characteristic curve, a standard statistical tool used to evaluate prediction performance. Subsequently, we apply our measure to predict inaccurate tax returns filed by self-employed individuals and sole proprietorships in Italy. In our application, a random forest model provides the best prediction: we show how it can be interpreted using measures of variable importance developed in the machine learning literature.
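
The loss-function idea in the abstract can be illustrated with a minimal sketch. This is not the authors' specification: the payoff structure (a fixed `gain` per inaccurate return audited, a fixed `cost` per audit, with `gain > cost`), the threshold rule, and all function names are assumptions made here for illustration. Under these assumptions the welfare gap between an error-free policy and a score-based policy is a linear function of the two ROC coordinates (false positive rate and true positive rate).

```python
def welfare(audited, labels, gain, cost):
    """Welfare of auditing the given indices (hypothetical payoff: gain per hit, cost per audit)."""
    hits = sum(labels[i] for i in audited)
    return gain * hits - cost * len(audited)


def prediction_loss(scores, labels, threshold, gain=3.0, cost=1.0):
    """Welfare of an ideal, error-free policy minus welfare of the threshold policy."""
    audited = [i for i, s in enumerate(scores) if s >= threshold]
    # Assumed ideal policy: audit exactly the truly inaccurate returns (optimal when gain > cost).
    ideal = [i for i, y in enumerate(labels) if y == 1]
    return welfare(ideal, labels, gain, cost) - welfare(audited, labels, gain, cost)


def roc_point(scores, labels, threshold):
    """(false positive rate, true positive rate) of the threshold policy."""
    pos = sum(labels)
    neg = len(labels) - pos
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return fp / neg, tp / pos


def loss_from_roc(fpr, tpr, pos, neg, gain=3.0, cost=1.0):
    """Same loss written in ROC coordinates: missed gains plus wasted audits."""
    return (gain - cost) * pos * (1 - tpr) + cost * neg * fpr


# Toy data (made up): predicted risk scores and true labels (1 = inaccurate return).
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 0, 1, 1, 0]
loss = prediction_loss(scores, labels, threshold=0.5)
fpr, tpr = roc_point(scores, labels, threshold=0.5)
print(loss, loss_from_roc(fpr, tpr, pos=3, neg=2))
```

In this toy setting, moving the threshold traces out the ROC curve, and each point on it maps to one value of the loss, which is one way to read the relationship the abstract mentions.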

Suggested Citation

  • Pietro Battiston & Simona Gamba & Alessandro Santoro, 2020. "Optimizing Tax Administration Policies with Machine Learning," Working Papers 436, University of Milano-Bicocca, Department of Economics, revised Mar 2020.
  • Handle: RePEc:mib:wpaper:436

    Download full text from publisher

    File URL: http://repec.dems.unimib.it/repec/pdf/mibwpaper436.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Keen, Michael & Slemrod, Joel, 2017. "Optimal tax administration," Journal of Public Economics, Elsevier, vol. 152(C), pages 133-142.
    2. Miguel Almunia & David Lopez-Rodriguez, 2018. "Under the Radar: The Effects of Monitoring Firms on Tax Compliance," American Economic Journal: Economic Policy, American Economic Association, vol. 10(1), pages 1-38, February.
    3. Sebastian Beer & Matthias Kasper & Erich Kirchler & Brian Erard, 0. "Do Audits Deter or Provoke Future Tax Noncompliance? Evidence on Self-Employed Taxpayers," CESifo Economic Studies, CESifo Group, vol. 66(3), pages 248-264.
    4. Hal R. Varian, 2014. "Big Data: New Tricks for Econometrics," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 3-28, Spring.
    5. Jon Kleinberg & Jens Ludwig & Sendhil Mullainathan & Ziad Obermeyer, 2015. "Prediction Policy Problems," American Economic Review, American Economic Association, vol. 105(5), pages 491-495, May.
    6. Jonah E. Rockoff & Brian A. Jacob & Thomas J. Kane & Douglas O. Staiger, 2011. "Can You Recognize an Effective Teacher When You Recruit One?," Education Finance and Policy, MIT Press, vol. 6(1), pages 43-74, January.
    7. Sendhil Mullainathan & Jann Spiess, 2017. "Machine Learning: An Applied Econometric Approach," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 87-106, Spring.
    8. Dana Chandler & Steven D. Levitt & John A. List, 2011. "Predicting and Preventing Shootings among At-Risk Youth," American Economic Review, American Economic Association, vol. 101(3), pages 288-292, May.

    Citations


    Cited by:

    1. Elliott Ash & Sergio Galletta & Tommaso Giommoni, 2021. "A Machine Learning Approach to Analyze and Support Anti-Corruption Policy," CESifo Working Paper Series 9015, CESifo.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Battiston, Pietro & Gamba, Simona & Santoro, Alessandro, 2024. "Machine learning and the optimization of prediction-based policies," Technological Forecasting and Social Change, Elsevier, vol. 199(C).
    2. Andini, Monica & Ciani, Emanuele & de Blasio, Guido & D'Ignazio, Alessio & Salvestrini, Viola, 2018. "Targeting with machine learning: An application to a tax rebate program in Italy," Journal of Economic Behavior & Organization, Elsevier, vol. 156(C), pages 86-102.
    3. de Blasio, Guido & D'Ignazio, Alessio & Letta, Marco, 2022. "Gotham city. Predicting ‘corrupted’ municipalities with machine learning," Technological Forecasting and Social Change, Elsevier, vol. 184(C).
    4. Monica Andini & Emanuele Ciani & Guido de Blasio & Alessio D'Ignazio & Viola Salvestrini, 2017. "Targeting policy-compliers with machine learning: an application to a tax rebate programme in Italy," Temi di discussione (Economic working papers) 1158, Bank of Italy, Economic Research and International Relations Area.
    5. Guido de Blasio & Alessio D'Ignazio & Marco Letta, 2020. "Predicting Corruption Crimes with Machine Learning. A Study for the Italian Municipalities," Working Papers 16/20, Sapienza University of Rome, DISS.
    6. Erik Heilmann & Janosch Henze & Heike Wetzel, 2021. "Machine learning in energy forecasts with an application to high frequency electricity consumption data," MAGKS Papers on Economics 202135, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).
    7. Filmer,Deon P. & Nahata,Vatsal & Sabarwal,Shwetlena, 2021. "Preparation, Practice, and Beliefs : A Machine Learning Approach to Understanding Teacher Effectiveness," Policy Research Working Paper Series 9847, The World Bank.
    8. Francesco Decarolis & Cristina Giorgiantonio, 2020. "Corruption red flags in public procurement: new evidence from Italian calls for tenders," Questioni di Economia e Finanza (Occasional Papers) 544, Bank of Italy, Economic Research and International Relations Area.
    9. Emanuel Kohlscheen, 2022. "Quantifying the Role of Interest Rates, the Dollar and Covid in Oil Prices," Papers 2208.14254, arXiv.org, revised Oct 2022.
    10. Erokhin, Dmitry & Zagler, Martin, 2024. "Who will sign a double tax treaty next? A prediction based on economic determinants and machine learning algorithms," Economic Modelling, Elsevier, vol. 139(C).
    11. Alessandra Garbero & Marco Letta, 2022. "Predicting household resilience with machine learning: preliminary cross-country tests," Empirical Economics, Springer, vol. 63(4), pages 2057-2070, October.
    12. Cerqua, Augusto & Letta, Marco, 2022. "Local inequalities of the COVID-19 crisis," Regional Science and Urban Economics, Elsevier, vol. 92(C).
    13. Emanuel Kohlscheen, 2024. "Forecasting oil prices with random forests," Empirical Economics, Springer, vol. 66(2), pages 927-943, February.
    14. Akash Malhotra, 2021. "A hybrid econometric–machine learning approach for relative importance analysis: prioritizing food policy," Eurasian Economic Review, Springer;Eurasia Business and Economics Society, vol. 11(3), pages 549-581, September.
    15. Potnuru Kishen Suraj & Ankesh Gupta & Makkunda Sharma & Sourabh Bikas Paul & Subhashis Banerjee, 2017. "On monitoring development indicators using high resolution satellite images," Papers 1712.02282, arXiv.org, revised Jun 2018.
    16. Eberhartinger, Eva & Safaei, Reyhaneh & Sureth, Caren & Wu, Yuchen, 2021. "Are risk-based tax audit strategies rewarded? An analysis of corporate tax avoidance," arqus Discussion Papers in Quantitative Tax Research 267, arqus - Arbeitskreis Quantitative Steuerlehre.
    17. Sophie-Charlotte Klose & Johannes Lederer, 2020. "A Pipeline for Variable Selection and False Discovery Rate Control With an Application in Labor Economics," Papers 2006.12296, arXiv.org, revised Jun 2020.
    18. Rama K. Malladi, 2024. "Benchmark Analysis of Machine Learning Methods to Forecast the U.S. Annual Inflation Rate During a High-Decile Inflation Period," Computational Economics, Springer;Society for Computational Economics, vol. 64(1), pages 335-375, July.
    19. Giovanni Di Franco & Michele Santurro, 2021. "Machine learning, artificial neural networks and social research," Quality & Quantity: International Journal of Methodology, Springer, vol. 55(3), pages 1007-1025, June.
    20. Sean Tanner & Jenna Terrell & Emily Vislosky & Jonathan Gellar & Brian Gill, "undated". "Predicting Early Fall Student Enrollment in the School District of Philadelphia," Mathematica Policy Research Reports 63a18bf538bd41f98d72ff91d, Mathematica Policy Research.

    More about this item

    Keywords

    policy prediction problems; tax behaviour; big data; machine learning;

    JEL classification:

    • H26 - Public Economics - - Taxation, Subsidies, and Revenue - - - Tax Evasion and Avoidance
    • H32 - Public Economics - - Fiscal Policies and Behavior of Economic Agents - - - Firm
    • C53 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Forecasting and Prediction Models; Simulation Methods
