
Complementing Observational Signals with Literature-Derived Distributed Representations for Post-Marketing Drug Surveillance

Authors

  • Justin Mower (Rice University)
  • Trevor Cohen (University of Washington, Biomedical Informatics and Medical Education)
  • Devika Subramanian (Rice University)

Abstract

Introduction: As a result of the well documented limitations of data collected by spontaneous reporting systems (SRS), such as bias and under-reporting, a number of authors have evaluated the utility of other data sources for the purpose of pharmacovigilance, including the biomedical literature. Previous work has demonstrated the utility of literature-derived distributed representations (concept embeddings) with machine learning for the purpose of drug side-effect prediction. In terms of data sources, these methods are complementary, observing drug safety from two different perspectives (knowledge extracted from the literature and statistics from SRS data). However, the combined utility of these pharmacovigilance methods has yet to be evaluated.

Objective: This research investigates the utility of directly or indirectly combining an observational signal from SRS with literature-derived distributed representations into a single feature vector or in an ensemble approach for downstream machine learning (logistic regression).

Methods: Leveraging a recently developed representation scheme, concept embeddings were generated from relational connections extracted from the literature and composed to represent drug and associated adverse reactions, as defined by two reference standards of positive (likely causal) and negative (no causal evidence) pairs. Embeddings were presented with and without common measures of observational signal from SRS sources to logistic regressors, and performance was evaluated with the receiver operating characteristic (ROC) area under the curve (AUC) metric.

Results: ROC AUC performance with these composite models improves up to ≈ 20% over SRS-based disproportionality metrics alone and exceeds the best prior results reported in the literature when models leverage both sources of information.

Conclusions: Results from this study support the hypothesis that knowledge extracted from the literature can enhance the performance of SRS-based methods (and vice versa). Across reference sets, using literature and SRS information together performed better than using either source alone, providing strong support for the complementary nature of these approaches to post-marketing drug surveillance.
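
As an illustration of the direct combination described in the abstract, the sketch below appends one observational signal to an embedding to form a single feature vector, fits a logistic regressor, and scores it with ROC AUC. It is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not name the disproportionality metrics used, so the proportional reporting ratio (PRR) appears here only as one commonly used example. The embeddings and signals in `demo` are synthetic random numbers, the logistic regression is a hand-written gradient descent, and every function name is hypothetical.

```python
import math
import random


def prr(a, b, c, d):
    """Proportional reporting ratio from a 2x2 table of SRS report counts.

    a: drug and event, b: drug without event,
    c: event without drug, d: neither.
    """
    return (a / (a + b)) / (c / (c + d))


def combine(embedding, signal):
    """Append the observational signal to the embedding as one extra feature."""
    return list(embedding) + [signal]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=300):
    """Batch gradient descent for logistic regression; the bias is the last weight."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            # zip stops at len(xi), so the bias w[-1] is added separately
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def predict(w, X):
    """Predicted probability of the positive class for each row of X."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]) for xi in X]


def roc_auc(labels, scores):
    """ROC AUC as the chance a positive outscores a negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def demo(seed=0, n=200, dim=4):
    """Compare embedding-only, signal-only, and combined features on synthetic data."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]  # alternate labels so both classes appear in each split
    # Assumption: random vectors stand in for literature-derived embeddings
    emb = [[rng.gauss(0.5 * yi, 1.0) for _ in range(dim)] for yi in y]
    # Assumption: a random score stands in for a log-scaled SRS signal
    sig = [rng.gauss(0.8 * yi, 1.0) for yi in y]
    feature_sets = {
        "embedding only": emb,
        "signal only": [[s] for s in sig],
        "combined": [combine(e, s) for e, s in zip(emb, sig)],
    }
    split = int(0.75 * n)  # train on the first 75%, evaluate on the rest
    results = {}
    for name, X in feature_sets.items():
        w = fit_logistic(X[:split], y[:split])
        results[name] = roc_auc(y[split:], predict(w, X[split:]))
    return results


if __name__ == "__main__":
    for name, auc in demo().items():
        print(f"{name}: ROC AUC = {auc:.3f}")
```

Because the data are synthetic, the printed AUC values say nothing about the study's results; the sketch only shows the mechanics of concatenating the two feature sources and evaluating each configuration on held-out pairs.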

Suggested Citation

  • Justin Mower & Trevor Cohen & Devika Subramanian, 2020. "Complementing Observational Signals with Literature-Derived Distributed Representations for Post-Marketing Drug Surveillance," Drug Safety, Springer, vol. 43(1), pages 67-77, January.
  • Handle: RePEc:spr:drugsa:v:43:y:2020:i:1:d:10.1007_s40264-019-00872-9
    DOI: 10.1007/s40264-019-00872-9




