
Predicting substantive biomedical citations without full text

Author

Listed:
  • Travis A. Hoppe

    (Office of the Director, National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD 20782)

  • Salsabil Arabi

    (Information School, School of Computer, Data, and Information Sciences, College of Letters and Science, University of Wisconsin-Madison, Madison, WI 53706)

  • B. Ian Hutchins

    (Information School, School of Computer, Data, and Information Sciences, College of Letters and Science, University of Wisconsin-Madison, Madison, WI 53706)

Abstract

Insights from biomedical citation networks can be used to identify promising avenues for accelerating research and its downstream bench-to-bedside translation. Citation analysis generally assumes that each citation documents substantive knowledge transfer that informed the conception, design, or execution of the main experiments. Citations may exist for other reasons. In this paper, we take advantage of late-stage citations added during peer review because these are less likely to represent substantive knowledge flow. Using a large, comprehensive feature set of open access data, we train a predictive model to identify late-stage citations. The model relies only on the title, abstract, and citations to previous articles but not the full text or future citation patterns, making it suitable for publications as soon as they are released, or those behind a paywall (the vast majority). We find that high prediction scores identify late-stage citations that were likely added during the peer review process as well as those more likely to be rhetorical, such as journal self-citations added during review. Our model conversely gives low prediction scores to early-stage citations and citation classes that are known to represent substantive knowledge transfer. Using this model, we find that US federally funded biomedical research publications represent 30% of the predicted early-stage (and more likely to be substantive) knowledge transfer from basic studies to clinical research, even though these comprise only 10% of the literature. This is a threefold overrepresentation in this important type of knowledge flow.
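
The threefold figure is the ratio of the two shares quoted in the abstract. A minimal sketch of that arithmetic follows; the function name `overrepresentation` is illustrative and does not come from the paper, and only the two percentages stated above are used.

```python
def overrepresentation(share_of_flow_pct: float, share_of_literature_pct: float) -> float:
    """Ratio of a group's share of a knowledge flow to its share of the literature.

    Both arguments are percentages. The name and signature are hypothetical,
    chosen only to illustrate the arithmetic in the abstract.
    """
    if share_of_literature_pct <= 0:
        raise ValueError("share_of_literature_pct must be positive")
    return share_of_flow_pct / share_of_literature_pct


# Figures quoted in the abstract: 30% of predicted early-stage
# basic-to-clinical knowledge transfer versus 10% of the literature.
ratio = overrepresentation(30, 10)
print(f"{ratio:.1f}x overrepresentation")
```

A ratio above 1 means the group contributes more to the knowledge flow than its share of the literature alone would suggest.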

Suggested Citation

  • Travis A. Hoppe & Salsabil Arabi & B. Ian Hutchins, 2023. "Predicting substantive biomedical citations without full text," Proceedings of the National Academy of Sciences, vol. 120(30), pages e2213697120, July.
  • Handle: RePEc:nas:journl:v:120:y:2023:p:e2213697120
    DOI: 10.1073/pnas.2213697120

    Download full text from publisher

    File URL: https://doi.org/10.1073/pnas.2213697120
    Download Restriction: no

