
Segmentation of Potential Fraud Taxpayers and Characterization in Personal Income Tax Using Data Mining Techniques

Author

Listed:
  • Camino González Vasco

    (Instituto de Estudios Fiscales)

  • María Jesús Delgado Rodríguez

    (Universidad Rey Juan Carlos)

  • Sonia de Lucas Santos

    (Universidad Autónoma de Madrid)

Abstract

This paper proposes an analytical framework that combines dimension reduction and data mining techniques to segment a sample according to the probability of potential fraud. The purpose of the study is twofold. First, it seeks to determine which tax benefits are more likely to be used by potential fraud taxpayers by investigating the structure of the Personal Income Tax. Second, it characterizes the segment profiles of potential fraud taxpayers through socioeconomic variables, in order to offer an audit selection strategy that improves tax compliance and tax design. The framework is applied to the annual Spanish Personal Income Tax sample designed by the Institute for Fiscal Studies. The results confirm that the proposed combination of data mining techniques offers valuable information for the study of tax fraud.

Suggested Citation

  • Camino González Vasco & María Jesús Delgado Rodríguez & Sonia de Lucas Santos, 2021. "Segmentation of Potential Fraud Taxpayers and Characterization in Personal Income Tax Using Data Mining Techniques," Hacienda Pública Española / Review of Public Economics, IEF, vol. 239(4), pages 127-157, November.
  • Handle: RePEc:hpe:journl:y:2021:v:239:i:4:p:127-157

    Download full text from publisher

    File URL: https://hpe-rpe.org/wp-admin/admin-ajax.php?juwpfisadmin=false&action=wpfd&task=file.download&wpfd_category_id=213&wpfd_file_id=4908&token=0e0c8cbbd546e97c64f7fc04b3e6f0f5&preview=1
    Download Restriction: no

    References listed on IDEAS

    1. Joel Slemrod, 2019. "Tax Compliance and Enforcement," Journal of Economic Literature, American Economic Association, vol. 57(4), pages 904-954, December.
    2. Sara Torregrosa, 2015. "Bypassing progressive taxation: fraud and base erosion in the Spanish income tax (1970-2001)," Working Papers 2015/31, Institut d'Economia de Barcelona (IEB).
    3. Fox, William F. & Luna, LeAnn & Schaur, Georg, 2014. "Destination taxation and evasion: Evidence from U.S. inter-state commodity flows," Journal of Accounting and Economics, Elsevier, vol. 57(1), pages 43-57.
    4. G. V. Kass, 1980. "An Exploratory Technique for Investigating Large Quantities of Categorical Data," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 29(2), pages 119-127, June.
    5. David López-Rodríguez & Cristina García Ciria, 2018. "Estructura impositiva de España en el contexto de la Unión Europea," Occasional Papers 1810, Banco de España.
    6. César Pérez López & María Jesús Delgado Rodríguez & Sonia de Lucas Santos, 2019. "Tax Fraud Detection through Neural Networks: An Application Using a Sample of Personal Income Taxpayers," Future Internet, MDPI, vol. 11(4), pages 1-13, March.

    Citations


    Cited by:

    1. César Pérez López & María Jesús Delgado Rodríguez & Sonia de Lucas Santos, 2023. "Modelización de los factores que afectan al fraude fiscal con técnicas de minería de datos: aplicación al Impuesto de la Renta en España," Hacienda Pública Española / Review of Public Economics, IEF, vol. 246(3), pages 137-164, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fox, William F. & Hargaden, Enda Patrick & Luna, LeAnn, 2022. "Statutory incidence and sales tax compliance: Evidence from Wayfair," Journal of Public Economics, Elsevier, vol. 213(C).
    2. Strobl, Carolin & Boulesteix, Anne-Laure & Augustin, Thomas, 2007. "Unbiased split selection for classification trees based on the Gini Index," Computational Statistics & Data Analysis, Elsevier, vol. 52(1), pages 483-501, September.
    3. Puklavec, Žiga & Kogler, Christoph & Stavrova, Olga & Zeelenberg, Marcel, 2023. "What we tweet about when we tweet about taxes: A topic modelling approach," Journal of Economic Behavior & Organization, Elsevier, vol. 212(C), pages 1242-1254.
    4. Hache, Emmanuel & Leboullenger, Déborah & Mignon, Valérie, 2017. "Beyond average energy consumption in the French residential housing market: A household classification approach," Energy Policy, Elsevier, vol. 107(C), pages 82-95.
    5. Ghosh, Atish R. & Qureshi, Mahvash S. & Kim, Jun Il & Zalduendo, Juan, 2014. "Surges," Journal of International Economics, Elsevier, vol. 92(2), pages 266-285.
      • Mahvash S Qureshi & Mr. Atish R. Ghosh & Mr. Juan Zalduendo & Mr. Jun I Kim, 2012. "Surges," IMF Working Papers 2012/022, International Monetary Fund.
    6. Arun Advani, 2022. "Who does and doesn't pay taxes?," Fiscal Studies, John Wiley & Sons, vol. 43(1), pages 5-22, March.
    7. Matthias Kasper & James Alm, 2022. "Does the Bomb-crater Effect Really Exist? Evidence from the Laboratory," FinanzArchiv: Public Finance Analysis, Mohr Siebeck, Tübingen, vol. 78(1-2), pages 87-111.
    8. Tomàs Aluja-Banet & Eduard Nafria, 2003. "Stability and scalability in decision trees," Computational Statistics, Springer, vol. 18(3), pages 505-520, September.
    9. I. Albarrán & P. Alonso-González & J. M. Marin, 2017. "Some criticism to a general model in Solvency II: an explanation from a clustering point of view," Empirical Economics, Springer, vol. 52(4), pages 1289-1308, June.
    10. Schwartz, Ira M. & York, Peter & Nowakowski-Sims, Eva & Ramos-Hernandez, Ana, 2017. "Predictive and prescriptive analytics, machine learning and child welfare risk assessment: The Broward County experience," Children and Youth Services Review, Elsevier, vol. 81(C), pages 309-320.
    11. Yousaf Muhammad & Dey Sandeep Kumar, 2022. "Best proxy to determine firm performance using financial ratios: A CHAID approach," Review of Economic Perspectives, Sciendo, vol. 22(3), pages 219-239, September.
    12. Ralf Elsner & Manfred Krafft & Arnd Huchzermeier, 2003. "Optimizing Rhenania's Mail-Order Business Through Dynamic Multilevel Modeling (DMLM)," Interfaces, INFORMS, vol. 33(1), pages 50-66, February.
    13. David R. Agrawal & Ronald B. Davies & Sara LaLumia & Nadine Riedel & Kimberley Scharf, 2021. "A snapshot of public finance research from immediately prior to the pandemic: IIPF 2020," International Tax and Public Finance, Springer;International Institute of Public Finance, vol. 28(5), pages 1276-1297, October.
    14. Gillitzer, Christian & Sinning, Mathias, 2020. "Nudging businesses to pay their taxes: Does timing matter?," Journal of Economic Behavior & Organization, Elsevier, vol. 169(C), pages 284-300.
    15. Serrano-Cinca, Carlos & Gutiérrez-Nieto, Begoña & Bernate-Valbuena, Martha, 2019. "The use of accounting anomalies indicators to predict business failure," European Management Journal, Elsevier, vol. 37(3), pages 353-375.
    16. Bíró, Anikó & Prinz, Dániel & Sándor, László, 2022. "The minimum wage, informal pay, and tax enforcement," Journal of Public Economics, Elsevier, vol. 215(C).
    17. Mascagni, Giulia & Lees, Adrienne, 2021. "Using Administrative Data to Assess the Impact of the Pandemic in Low-Income Countries: An Application with VAT Data in Rwanda," Working Papers 16468, Institute of Development Studies, International Centre for Tax and Development.
    18. James Alm, 2024. "Tax Compliance, Technology, Trust, and Inequality in a Post-Pandemic World," Working Papers 2404, Tulane University, Department of Economics.
    19. C. Yiwei Zhang & Jeffrey Hemmeter & Judd B. Kessler & Robert D. Metcalfe & Robert Weathers, 2023. "Nudging Timely Wage Reporting: Field Experimental Evidence from the U.S. Supplemental Security Income Program," Management Science, INFORMS, vol. 69(3), pages 1341-1353, March.
    20. Clara Martínez Toledano, 2020. "House Price Cycles, Wealth Inequality and Portfolio Reshuffling," Working Papers hal-02876979, HAL.

    More about this item

    Keywords

    Personal income tax; Tax compliance; Data mining techniques; Multilayer perceptron; Decision trees; Fiscal fraud detection; Tax evaluation

    JEL classification:

    • H24 - Public Economics - - Taxation, Subsidies, and Revenue - - - Personal Income and Other Nonbusiness Taxes and Subsidies
    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
    • C38 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Classification Methods; Cluster Analysis; Principal Components; Factor Analysis
