Printed from https://ideas.repec.org/a/plo/pone00/0311238.html

Needle in a haystack: Harnessing AI in drug patent searches and prediction

Authors

  • Leonardo Costa Ribeiro
  • Valbona Muzaka

Abstract

The classification codes granted by patent offices are useful instruments for simplifying the bewildering variety of patents in existence. They are singularly unhelpful, however, in locating a specific subgroup of patents such as that of drug-related pharmaceutical patents for which no classification codes exist. Taking advantage of advances in artificial intelligence and in natural language processing in particular, we offer a new method of identifying chemical drug-related patents in this article. The aim is primarily that of demonstrating how the proverbial needle in a haystack was identified, namely through leveraging the superb pattern-recognition abilities of the BERT (Bidirectional Encoder Representations from Transformers) algorithm. We build three different databases to train our algorithm and fine-tune its abilities to identify the patent group in question by exposing it to additional texts containing structures that are much more likely to be present in them, until we obtain the highest possible F1-score, combined with an accuracy of 94.40%. We also demonstrate some possible uses of the algorithm. Its application to the US patent office database enables the identification of potential chemical drug patents up to ten years before drug approval, whereas its application to the German patent office reveals the regional nature of drug R&D and patenting strategies. The hope is that both the method proposed and its applications will be further refined and expanded forthwith.
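
The abstract reports an F1-score alongside an accuracy figure for what is in effect a binary decision (chemical drug patent or not). As a minimal sketch of how those two metrics relate, the following computes both from a list of true and predicted labels. The function name `binary_metrics` and the example labels are hypothetical illustrations and are not taken from the article or its data.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for 0/1 labels (1 = positive class)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = drug patent, 0 = any other patent.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

When the positive class is rare, as a needle-in-a-haystack search implies, accuracy can stay high even if many positives are missed, which is one reason to report F1 next to it.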

Suggested Citation

  • Leonardo Costa Ribeiro & Valbona Muzaka, 2024. "Needle in a haystack: Harnessing AI in drug patent searches and prediction," PLOS ONE, Public Library of Science, vol. 19(12), pages 1-24, December.
  • Handle: RePEc:plo:pone00:0311238
    DOI: 10.1371/journal.pone.0311238

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0311238
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0311238&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Edwin Mansfield, 1986. "Patents and Innovation: An Empirical Study," Management Science, INFORMS, vol. 32(2), pages 173-181, February.
    2. Machlup, Fritz & Penrose, Edith, 1950. "The Patent Controversy in the Nineteenth Century," The Journal of Economic History, Cambridge University Press, vol. 10(1), pages 1-29, May.
    3. Jack W Scannell & Jim Bosley, 2016. "When Quality Beats Quantity: Decision Theory, Drug Discovery, and the Reproducibility Crisis," PLOS ONE, Public Library of Science, vol. 11(2), pages 1-21, February.
    4. Benjamin Balsmeier & Mohamad Assaf & Tyler Chesebro & Gabe Fierro & Kevin Johnson & Scott Johnson & Guan‐Cheng Li & Sonja Lück & Doug O'Reagan & Bill Yeh & Guangzheng Zang & Lee Fleming, 2018. "Machine learning and natural language processing on the patent corpus: Data, tools, and new measures," Journal of Economics & Management Strategy, Wiley Blackwell, vol. 27(3), pages 535-553, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jeffrey L. Furman & Markus Nagler & Martin Watzinger, 2021. "Disclosure and Subsequent Innovation: Evidence from the Patent Depository Library Program," American Economic Journal: Economic Policy, American Economic Association, vol. 13(4), pages 239-270, November.
    2. Bernhard Ganglmair & Imke Reimers, 2019. "Visibility of Technology and Cumulative Innovation: Evidence from Trade Secrets Laws," CRC TR 224 Discussion Paper Series crctr224_2019_119v1, University of Bonn and University of Mannheim, Germany.
    3. Harabi, Najib, 1994. "Technischer Fortschritt in der Schweiz: Empirische Ergebnisse aus industrieökonomischer Sicht [Technical Progress in Switzerland: Empirical Results from an Industrial Economics Perspective]," MPRA Paper 6725, University Library of Munich, Germany.
    4. Petra Moser, 2012. "Innovation without Patents: Evidence from World's Fairs," Journal of Law and Economics, University of Chicago Press, vol. 55(1), pages 43-74.
    5. Armin Mertens & Marc Scheufen, 2024. "Intellectual property and fourth industrial revolution technologies: how the patent system is shaping the future in the data-driven economy," European Journal of Law and Economics, Springer, vol. 57(1), pages 275-310, April.
    6. Kimmel, Randall K. & Antenucci, Robert & Hasan, Shahriar, 2017. "Investor perception and business method patents: A natural experiment," Economic Analysis and Policy, Elsevier, vol. 54(C), pages 26-48.
    7. Ute Laermann-Nguyen & Martin Backfisch, 2021. "Innovation crisis in the pharmaceutical industry? A survey," SN Business & Economics, Springer, vol. 1(12), pages 1-37, December.
    8. Maestracci, Aria, 2023. "An Examination of the Economic and Social Impacts of Corporate Innovation and Interventions," MPRA Paper 116932, University Library of Munich, Germany, revised 03 Feb 2023.
    9. Ozgur Aydogmus & Erkan Gürpinar, 2022. "Science, Technology and Institutional Change in Knowledge Production: An Evolutionary Game Theoretic Framework," Dynamic Games and Applications, Springer, vol. 12(4), pages 1163-1188, December.
    10. Pauly, Stefan & Stipanicic, Fernando, 2021. "The creation and diffusion of knowledge: Evidence from the Jet Age," CEPREMAP Working Papers (Docweb) 2112, CEPREMAP.
    11. Wagenaar, Homer & Colvin, Christopher L., 2025. "Patently peculiar: Patents and innovation in the United Kingdom of the Netherlands," QUCEH Working Paper Series 25-04, Queen's University Belfast, Queen's University Centre for Economic History, revised 2025.
    12. Yu-Shan Chen & Ke-Chiun Chang, 2009. "Using neural network to analyze the influence of the patent performance upon the market value of the US pharmaceutical companies," Scientometrics, Springer;Akadémiai Kiadó, vol. 80(3), pages 637-655, September.
    13. Shavell, Steven & van Ypersele, Tanguy, 2001. "Rewards versus Intellectual Property Rights," Journal of Law and Economics, University of Chicago Press, vol. 44(2), pages 525-547, October.
    14. Fontana, Roberto & Nuvolari, Alessandro & Shimizu, Hiroshi & Vezzulli, Andrea, 2013. "Reassessing patent propensity: Evidence from a dataset of R&D awards, 1977–2004," Research Policy, Elsevier, vol. 42(10), pages 1780-1792.
    15. Barge-Gil, Andrés & López, Alberto, 2014. "R&D determinants: Accounting for the differences between research and development," Research Policy, Elsevier, vol. 43(9), pages 1634-1648.
    16. Elif Bascavusoglu & Maria Pluvia Zuniga, 2005. "The effects of intellectual property protection on international knowledge contracting," Cahiers de la Maison des Sciences Economiques bla05009, Université Panthéon-Sorbonne (Paris 1).
    17. B. Zorina Khan & Kenneth L. Sokoloff, 2004. "Institutions and Technological Innovation During the Early Economic Growth: Evidence from the Great Inventors of the United States, 1790-1930," NBER Working Papers 10966, National Bureau of Economic Research, Inc.
    18. Dolfsma, W.A., 2006. "IPRs, Technological Development, and Economic Development," ERIM Report Series Research in Management ERS-2006-004-ORG, Erasmus Research Institute of Management (ERIM), ERIM is the joint research institute of the Rotterdam School of Management, Erasmus University and the Erasmus School of Economics (ESE) at Erasmus University Rotterdam.
    19. Sternitzke, Christian, 2013. "An exploratory analysis of patent fencing in pharmaceuticals: The case of PDE5 inhibitors," Research Policy, Elsevier, vol. 42(2), pages 542-551.
    20. Scherer, F.M., 2010. "Pharmaceutical Innovation," Handbook of the Economics of Innovation, in: Bronwyn H. Hall & Nathan Rosenberg (ed.), Handbook of the Economics of Innovation, edition 1, volume 1, chapter 0, pages 539-574, Elsevier.
