Printed from https://ideas.repec.org/a/gam/jdataj/v11y2026i5p122-d1947435.html

Evaluating the Integrity of LLM-Generated Citations: Prevalence and Risks of Fabricated References in Scientific Literature

Author

Listed:
  • Pablo Picazo-Sanchez

    (School of Information Technology, Halmstad University, 301 18 Halmstad, Sweden)

  • Lara Ortiz-Martin

    (School of Information Technology, Halmstad University, 301 18 Halmstad, Sweden)

Abstract

Large Language Models (LLMs) have become part of everyday life, and academia is no exception, where they offer tools for tasks such as text rephrasing and summarisation. However, this integration raises significant concerns about the integrity of science. In this paper, we investigate LLM hallucinations in generated scientific references. Using nine LLMs, we generated a dataset of 74,196 BibTeX references to quantify and analyse fabricated references, distinguishing between intrinsic and extrinsic hallucinations. We also extracted and analysed 127,063 references from 3541 papers published in 2023 to assess the prevalence of fake bibliographic data. Our manual verification process identified eight fabricated references. While the overall rate is low, the mere existence of fabricated content in the peer-reviewed literature is a critical integrity issue and demonstrates a vulnerability in current academic validation systems. The significance of our finding lies not in its statistical prevalence but in the need for rigorous, human-validated processes that prevent spurious citations from entering the literature, regardless of their source.
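To put the reported counts in proportion, the arithmetic behind the "low overall rate" can be sketched as follows. This is a minimal illustration only: it assumes the eight fabricated references are counted against all 127,063 extracted references (the abstract does not state the denominator explicitly), and the function name `fabrication_rate` is hypothetical, not taken from the paper.

```python
def fabrication_rate(fabricated: int, total: int) -> float:
    """Return the fraction of references flagged as fabricated."""
    if total <= 0:
        raise ValueError("total must be positive")
    return fabricated / total


# Counts taken from the abstract. Using all extracted references as the
# denominator is an assumption made for this illustration.
FABRICATED = 8
TOTAL_REFERENCES = 127_063
TOTAL_PAPERS = 3_541

rate = fabrication_rate(FABRICATED, TOTAL_REFERENCES)
per_100k = rate * 100_000                      # same rate, per 100,000 references
refs_per_paper = TOTAL_REFERENCES / TOTAL_PAPERS  # mean references per paper

print(f"Fabricated share: {rate:.4%}")
print(f"Per 100,000 references: {per_100k:.1f}")
print(f"Mean references per paper: {refs_per_paper:.1f}")
```

Under that assumption, the share works out to roughly 6 fabricated references per 100,000, which is consistent with the abstract's point that the concern is the existence of such references rather than their frequency.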

Suggested Citation

  • Pablo Picazo-Sanchez & Lara Ortiz-Martin, 2026. "Evaluating the Integrity of LLM-Generated Citations: Prevalence and Risks of Fabricated References in Scientific Literature," Data, MDPI, vol. 11(5), pages 1-24, May.
  • Handle: RePEc:gam:jdataj:v:11:y:2026:i:5:p:122-:d:1947435

    Download full text from publisher

    File URL: https://www.mdpi.com/2306-5729/11/5/122/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2306-5729/11/5/122/
    Download Restriction: no
