Preventing rather than Punishing: An Early Warning Model of Malfeasance in Public Procurement


  • Gallego, J
  • Rivero, G
  • Martínez, J.D.


Is it possible to predict corruption and inefficiency in public procurement? With the proliferation of e-procurement in the public sector, anti-corruption agencies and watchdog organizations in many countries now have access to powerful sources of information that may help anticipate which transactions become faulty and why. In this paper, we discuss the promises and challenges of using machine learning models to predict inefficiency and corruption in public procurement, from the perspective of both researchers and practitioners. We exemplify this procedure using a unique dataset characterizing more than 2 million public contracts in Colombia, training machine learning models to predict which of them face corruption investigations or implementation inefficiencies. We use different techniques to handle the class imbalance typical of these applications, report the high accuracy of our models, simulate the trade-off between precision and recall in this context, and determine which features contribute the most to the prediction of malfeasance within contracts. Our approach is useful for governments interested in exploiting large administrative datasets to improve the provision of public goods, and it highlights some of the trade-offs and challenges that they might face throughout this process.
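As a rough illustration of the precision-recall trade-off mentioned in the abstract, the sketch below flags contracts whose predicted risk score meets a threshold and reports both metrics at several thresholds. It is not the authors' code: the function name `precision_recall`, the labels, and the scores are hypothetical values invented for this example, with 2 positives out of 10 to mimic class imbalance.

```python
# Toy illustration only: labels and scores are invented, not taken from the paper.

def precision_recall(labels, scores, threshold):
    """Return (precision, recall) when items with score >= threshold are flagged."""
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for y, f in zip(labels, flagged) if y == 1 and f)
    fp = sum(1 for y, f in zip(labels, flagged) if y == 0 and f)
    fn = sum(1 for y, f in zip(labels, flagged) if y == 1 and not f)
    # Convention assumed here: precision is 1.0 when nothing is flagged.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical imbalanced sample: 1 = contract later investigated, 0 = not.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.45, 0.70, 0.60, 0.90]

for threshold in (0.4, 0.65, 0.8):
    p, r = precision_recall(labels, scores, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

On this toy sample, raising the threshold from 0.4 to 0.8 lifts precision from 0.50 to 1.00 while recall falls from 1.00 to 0.50: fewer false alarms for auditors, at the cost of missing one of the two problematic contracts.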

Suggested Citation

  • Gallego, J & Rivero, G & Martínez, J.D., 2018. "Preventing rather than Punishing: An Early Warning Model of Malfeasance in Public Procurement," Documentos de Trabajo 016724, Universidad del Rosario.
  • Handle: RePEc:col:000092:016724


    More about this item

    Keywords:

    Corruption; Inefficiency; Machine Learning; Public Procurement

    JEL classification:

    • C53 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Forecasting and Prediction Models; Simulation Methods
    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
    • M42 - Business Administration and Business Economics; Marketing; Accounting; Personnel Economics - - Accounting - - - Auditing
    • O12 - Economic Development, Innovation, Technological Change, and Growth - - Economic Development - - - Microeconomic Analyses of Economic Development
