Printed from https://ideas.repec.org/p/wpa/wuwpot/0507001.html

Not the First Digit! Using Benford’s Law to Detect Fraudulent Scientific Data

Author

Listed:
  • Andreas Diekmann (ETH Zurich)

Abstract

Digits in statistical data produced by natural or social processes are often distributed in a manner described by "Benford’s law". Recently, a test against this distribution was used to identify fraudulent accounting data. This test is based on the supposition that real data follow the Benford distribution while fabricated data do not. Is it possible to apply Benford tests to detect fabricated or falsified scientific data as well as fraudulent financial data? We approached this question in two ways. First, we examined the use of the Benford distribution as a standard by checking digit frequencies in published statistical estimates. Second, we conducted experiments in which subjects were asked to fabricate statistical estimates (regression coefficients). These experimental data were scrutinized for possible deviations from the Benford distribution. There were two main findings. First, the digits of the published regression coefficients were approximately Benford distributed. Second, the experimental results yielded new insights into the strengths and weaknesses of Benford tests. At least in the case of regression coefficients, there were indications that checks for digit-preference anomalies should focus less on the first digit and more on the second and higher-order digits.
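
The kind of digit check the abstract describes can be sketched in a few lines of Python. This is an illustration only, not the procedure used in the paper: it assumes a plain chi-square goodness-of-fit test against the standard Benford probabilities for the first and second significant digit. The function names (`benford_probs`, `digit_at`, `chi_square`) and the sample coefficients are hypothetical.

```python
import math
from collections import Counter

# 5% critical values of the chi-square distribution (standard table values):
# df = 8 for the first digit (9 categories), df = 9 for the second (10 categories).
CRITICAL_5PCT = {1: 15.507, 2: 16.919}


def benford_probs(position):
    """Benford probabilities for the first (1) or second (2) significant digit."""
    if position == 1:
        # P(first digit = d) = log10(1 + 1/d), d = 1..9
        return {d: math.log10(1 + 1 / d) for d in range(1, 10)}
    if position == 2:
        # P(second digit = d) = sum over first digits k of log10(1 + 1/(10k + d))
        return {d: sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))
                for d in range(10)}
    raise ValueError("this sketch only covers positions 1 and 2")


def digit_at(x, position):
    """Return the significant digit of a nonzero number at a 1-based position."""
    # Scientific notation strips sign, leading zeros and the decimal exponent,
    # e.g. 0.0372 -> '3.7200000000' -> '37200000000'.
    mantissa = f"{abs(x):.10e}".split("e")[0]
    return int(mantissa.replace(".", "")[position - 1])


def chi_square(values, position=1):
    """Chi-square statistic of observed digit counts against Benford's law."""
    probs = benford_probs(position)
    counts = Counter(digit_at(v, position) for v in values if v != 0)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("need at least one nonzero value")
    return sum((counts.get(d, 0) - n * p) ** 2 / (n * p) for d, p in probs.items())


if __name__ == "__main__":
    # Hypothetical coefficients for illustration only; a real test needs far
    # more values for the chi-square approximation to be reasonable.
    coefficients = [0.152, -1.37, 0.0481, 2.95, 0.33, -0.127, 1.08, 0.764]
    for pos in (1, 2):
        stat = chi_square(coefficients, pos)
        print(f"digit {pos}: chi-square = {stat:.2f}, 5% critical value = {CRITICAL_5PCT[pos]}")
```

A statistic above the critical value would suggest that the digits deviate from the Benford distribution. One caveat of this sketch: a value reported to a single significant digit (such as 0.3) is treated as having a second digit of 0, which may not be appropriate for rounded published estimates.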

Suggested Citation

  • Andreas Diekmann, 2005. "Not the First Digit! Using Benford’s Law to Detect Fraudulent Scientific Data," Others 0507001, University Library of Munich, Germany.
  • Handle: RePEc:wpa:wuwpot:0507001
    Note: Type of Document - pdf; pages: 25

    Download full text from publisher

    File URL: https://econwpa.ub.uni-muenchen.de/econ-wp/othr/papers/0507/0507001.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Schraepler, Joerg-Peter & Wagner, Gert G., 2003. "Identification, Characteristics and Impact of Faked Interviews in Surveys: An Analysis by Means of Genuine Fakes in the Raw Data of SOEP," IZA Discussion Papers 969, Institute of Labor Economics (IZA).

    Citations

    Blog mentions

    As found by EconAcademics.org, the blog aggregator for Economics research:
    1. Statistical Sleuthing on the Iran Election
      by (author unknown) in The Numbers Guy on 2009-07-01 05:46:58

    Cited by:

    1. Bauer, Johannes & Groß, Jochen, 2011. "Difficulties Detecting Fraud? The Use of Benford’s Law on Regression Tables," Journal of Economics and Statistics (Jahrbuecher fuer Nationaloekonomie und Statistik), De Gruyter, vol. 231(5-6), pages 733-748, October.
    2. Shikano, Susumu & Mack, Verena, 2011. "When Does the Second-Digit Benford’s Law-Test Signal an Election Fraud?: Facts or Misleading Test Results," Journal of Economics and Statistics (Jahrbuecher fuer Nationaloekonomie und Statistik), De Gruyter, vol. 231(5-6), pages 719-732, October.
    3. Schräpler, Jörg-Peter, 2011. "Benford’s Law as an Instrument for Fraud Detection in Surveys Using the Data of the Socio-Economic Panel (SOEP)," Journal of Economics and Statistics (Jahrbuecher fuer Nationaloekonomie und Statistik), De Gruyter, vol. 231(5-6), pages 685-718, October.
    4. Mr. Jesus R Gonzalez-Garcia & Mr. Gonzalo C Pastor Campos, 2009. "Benford’s Law and Macroeconomic Data Quality," IMF Working Papers 2009/010, International Monetary Fund.
    5. Kundt, Thorben, 2014. "Applying “Benford’s law” to the Crosswise Model: Findings from an online survey on tax evasion," Working Paper 148/2014, Helmut Schmidt University, Hamburg.
    6. Brähler, Gernot & Bensmann, Markus & Emke, Anna-Lena, 2010. "Der Einsatz mathematisch-statistischer Methoden in der digitalen Betriebsprüfung," Ilmenauer Schriften zur Betriebswirtschaftslehre, Technische Universität Ilmenau, Institut für Betriebswirtschaftslehre, volume 4, number 42010.


      More about this item

      Keywords

      Benford; Benford's law; falsification of data; fabrication of data

      JEL classification:

      • C - Mathematical and Quantitative Methods

      NEP fields

      This paper has been announced in the following NEP Reports:

      Statistics
