Printed from https://ideas.repec.org/h/bis/bisifc/53-11.html

Quality checks on granular banking data: an experimental approach based on machine learning

In: Micro data for the macro world

Author

Listed:
  • Fabio Zambuto

Abstract

We propose a new methodology, based on machine learning algorithms, for the automatic detection of outliers in the data that banks report to the Bank of Italy. Our analysis focuses on granular data gathered within the statistical data collection on payment services, in which the lack of strong ex ante deterministic relationships among the collected variables makes standard diagnostic approaches less powerful. Quantile regression forests are used to derive a region of acceptance for the targeted information. For a given level of probability, plausibility thresholds are obtained on the basis of individual bank characteristics and are automatically updated as new data are reported. The approach was applied to validate semi-annual data on debit card issuance received from reporting agents between December 2016 and June 2018. The algorithm was trained with data reported in previous periods and tested by cross-checking the identified outliers with the reporting agents. The method made it possible to detect, with a high level of precision in terms of false positives, new outliers that had not been detected using the standard procedures.
(This abstract was borrowed from another version of this item.)
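
The idea of deriving a region of acceptance from conditional quantiles can be illustrated with a minimal, pure-Python sketch. It is not the paper's implementation: the data are synthetic, the features (`customers`, `branches`), the target (debit cards issued) and every function name are hypothetical, and the forest is deliberately simplified. Each tree splits on one randomly chosen feature per node, and quantiles are taken over the pooled leaf targets rather than with the per-tree weighting of the original quantile regression forest estimator.

```python
import random

def sse(ys):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def best_split(rows, j, min_leaf):
    """Threshold on feature j minimising the children's total SSE, or None."""
    best = None
    for t in sorted({x[j] for x, _ in rows}):
        left = [y for x, y in rows if x[j] < t]
        right = [y for x, y in rows if x[j] >= t]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, t)
    return None if best is None else best[1]

def build_tree(rows, rng, min_leaf=5, depth=0, max_depth=5):
    """Grow a tree whose leaves keep every training target (not just the mean)."""
    j = rng.randrange(len(rows[0][0]))  # one random candidate feature per node
    t = best_split(rows, j, min_leaf) if depth < max_depth else None
    if t is None:
        return {"leaf": [y for _, y in rows]}
    left = [r for r in rows if r[0][j] < t]
    right = [r for r in rows if r[0][j] >= t]
    return {"feature": j, "threshold": t,
            "left": build_tree(left, rng, min_leaf, depth + 1, max_depth),
            "right": build_tree(right, rng, min_leaf, depth + 1, max_depth)}

def fit_forest(rows, n_trees=25, seed=0):
    """Fit each tree on a bootstrap sample of the historical reports."""
    rng = random.Random(seed)
    return [build_tree([rng.choice(rows) for _ in rows], rng)
            for _ in range(n_trees)]

def leaf_targets(tree, x):
    """Training targets stored in the leaf that x falls into."""
    while "leaf" not in tree:
        go_left = x[tree["feature"]] < tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["leaf"]

def acceptance_region(forest, x, alpha=0.05):
    """Empirical (alpha/2, 1 - alpha/2) quantiles of the pooled leaf targets."""
    pooled = sorted(y for tree in forest for y in leaf_targets(tree, x))
    last = len(pooled) - 1
    return pooled[int(alpha / 2 * last)], pooled[int((1 - alpha / 2) * last)]

def is_outlier(forest, x, reported, alpha=0.05):
    """Flag a reported value that falls outside the acceptance region."""
    low, high = acceptance_region(forest, x, alpha)
    return not (low <= reported <= high)

# Synthetic history: cards issued is roughly half the customer base, plus noise.
data_rng = random.Random(42)
history = []
for _ in range(200):
    customers = data_rng.uniform(100, 2000)
    branches = data_rng.randint(1, 50)
    cards = 0.5 * customers * (1 + data_rng.gauss(0, 0.1))
    history.append(((customers, branches), cards))

forest = fit_forest(history, n_trees=25, seed=1)
low, high = acceptance_region(forest, (1000, 10), alpha=0.05)
print(f"acceptance region for a bank with 1000 customers: [{low:.0f}, {high:.0f}]")
print("report of 5000 cards flagged:", is_outlier(forest, (1000, 10), 5000))
```

Refitting the forest whenever a new reporting period arrives is one simple way to mimic thresholds that are updated automatically as new data are reported. In practice a tested library implementation of quantile regression forests would be preferable to this hand-rolled version.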

Suggested Citation

  • Fabio Zambuto, 2021. "Quality checks on granular banking data: an experimental approach based on machine learning," IFC Bulletins chapters, in: Bank for International Settlements (ed.), Micro data for the macro world, volume 53, Bank for International Settlements.
  • Handle: RePEc:bis:bisifc:53-11

    Download full text from publisher

    File URL: https://www.bis.org/ifc/publ/ifcb53_11.pdf
    Download Restriction: no

    Citations


    Cited by:

    1. is not listed on IDEAS
    2. Davide Nicola Continanza & Andrea del Monaco & Marco di Lucido & Daniele Figoli & Pasquale Maddaloni & Filippo Quarta & Giuseppe Turturiello, 2023. "Stacking machine learning models for anomaly detection: comparing AnaCredit to other banking data sets," IFC Bulletins chapters, in: Bank for International Settlements (ed.), Data science in central banking: applications and tools, volume 59, Bank for International Settlements.
    3. Vittoria La Serra & Emiliano Svezia, 2024. "A supervised record linkage approach for anomaly detection in insurance assets granular data," Quality & Quantity: International Journal of Methodology, Springer, vol. 58(5), pages 4181-4205, October.
    4. Massimo Casa & Laura Graziani Palmieri & Laura Mellone & Francesca Monacelli, 2022. "The integrated approach adopted by Bank of Italy in the collection and production of credit and financial data," Questioni di Economia e Finanza (Occasional Papers) 667, Bank of Italy, Economic Research and International Relations Area.
    5. Fabio Zambuto & Simona Arcuti & Roberto Sabatini & Daniele Zambuto, 2021. "Application of classification algorithms for the assessment of confirmation to quality remarks," Questioni di Economia e Finanza (Occasional Papers) 631, Bank of Italy, Economic Research and International Relations Area.
    6. Canio Benedetto & Sara Crestini & Alessandro de Gregorio & Marco de Leonardis & Andrea del Monaco & Daniele Gulino & Paolo Massaro & Francesca Monacelli & Lorenzo Rubeo, 2025. "Applying artificial intelligence to support regulatory reporting management: the experience at Banca d'Italia," Questioni di Economia e Finanza (Occasional Papers) 927, Bank of Italy, Economic Research and International Relations Area.
    7. Francesco Cusano & Giuseppe Marinelli & Stefano Piermattei, 2021. "Learning from revisions: a tool for detecting potential errors in banks' balance sheet statistical reporting," Questioni di Economia e Finanza (Occasional Papers) 611, Bank of Italy, Economic Research and International Relations Area.
    8. Francesco Cusano & Giuseppe Marinelli & Stefano Piermattei, 2022. "Learning from revisions: an algorithm to detect errors in banks’ balance sheet statistical reporting," Quality & Quantity: International Journal of Methodology, Springer, vol. 56(6), pages 4025-4059, December.

    More about this item

    JEL classification:

    • C18 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Methodological Issues: General
    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
    • G21 - Financial Economics - - Financial Institutions and Services - - - Banks; Other Depository Institutions; Micro Finance Institutions; Mortgages
