
CBAG: Conditional biomedical abstract generation

Author

Listed:
  • Justin Sybrandt
  • Ilya Safro

Abstract

Biomedical research papers often combine disjoint concepts in novel ways, such as when describing a newly discovered relationship between an understudied gene and an important disease. These concepts are often explicitly encoded as metadata keywords, such as the author-provided terms included with many documents in the MEDLINE database. While substantial recent work has addressed text generation in a more general context, applications such as scientific writing assistants or hypothesis generation systems could benefit from the capacity to select the specific set of concepts that underpin a generated biomedical text. We propose a conditional language model following the transformer architecture. This model uses the “encoder stack” to encode the concepts that a user wishes to discuss in the generated text. The “decoder stack” then follows the masked self-attention pattern to perform text generation, using both the prior tokens and the encoded condition. We demonstrate that this approach provides significant control over the generated content while still producing reasonable biomedical text.
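
The decoding pattern described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a single-head toy in plain Python with no learned projections, and the names `attend` and `decoder_step` as well as the example vectors are hypothetical. It only shows the two ingredients the abstract names: masked self-attention over prior tokens, followed by attention over the encoded condition.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return out, weights

def decoder_step(token_vecs, condition_vecs, position):
    """Simplified decoder output for the token at `position` (assumed layout).

    Masked self-attention: only tokens 0..position are visible.
    Cross-attention: every encoded condition vector is visible.
    """
    visible = token_vecs[: position + 1]  # causal mask implemented by slicing
    self_out, self_w = attend(token_vecs[position], visible, visible)
    cross_out, cross_w = attend(self_out, condition_vecs, condition_vecs)
    combined = [s + c for s, c in zip(self_out, cross_out)]  # residual-style sum
    return combined, self_w, cross_w

# Hypothetical stand-ins for encoded keywords and embedded text tokens.
condition = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[0.5, 0.1], [0.2, 0.9], [0.7, 0.3]]
out, self_w, cross_w = decoder_step(tokens, condition, position=1)
print(len(self_w), len(cross_w))
```

Because of the mask, the output at a given position depends only on earlier tokens and on the condition, which is what lets such a model generate text left to right while still being steered by the chosen concepts.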

Suggested Citation

  • Justin Sybrandt & Ilya Safro, 2021. "CBAG: Conditional biomedical abstract generation," PLOS ONE, Public Library of Science, vol. 16(7), pages 1-18, July.
  • Handle: RePEc:plo:pone00:0253905
    DOI: 10.1371/journal.pone.0253905

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0253905
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0253905&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Abel Brodeur, Nikolai M. Cook, Anthony Heyes, 2022. "We Need to Talk about Mechanical Turk: What 22,989 Hypothesis Tests Tell Us about Publication Bias and p-Hacking in Online Experiments," LCERPA Working Papers am0133, Laurier Centre for Economic Research and Policy Analysis.
    2. Jasper Brinkerink, 2023. "When Shooting for the Stars Becomes Aiming for Asterisks: P-Hacking in Family Business Research," Entrepreneurship Theory and Practice, vol. 47(2), pages 304-343, March.
    3. Arnaud Vaganay, 2016. "Cluster Sampling Bias in Government-Sponsored Evaluations: A Correlational Study of Employment and Welfare Pilots in England," PLOS ONE, Public Library of Science, vol. 11(8), pages 1-21, August.
    4. Rosa Lavelle-Hill & Gavin Smith & Anjali Mazumder & Todd Landman & James Goulding, 2021. "Machine learning methods for “wicked” problems: exploring the complex drivers of modern slavery," Palgrave Communications, Palgrave Macmillan, vol. 8(1), pages 1-11, December.
    5. David Winkelmann & Marius Ötting & Christian Deutscher & Tomasz Makarewicz, 2024. "Are Betting Markets Inefficient? Evidence From Simulations and Real Data," Journal of Sports Economics, vol. 25(1), pages 54-97, January.
    6. Graham Elliott & Nikolay Kudrin & Kaspar Wüthrich, 2022. "Detecting p‐Hacking," Econometrica, Econometric Society, vol. 90(2), pages 887-906, March.
    7. Konrad Neumann & Ulrike Grittner & Sophie K Piper & Andre Rex & Oscar Florez-Vargas & George Karystianis & Alice Schneider & Ian Wellwood & Bob Siegerink & John P A Ioannidis & Jonathan Kimmelman & Ul, 2017. "Increasing efficiency of preclinical research by group sequential designs," PLOS Biology, Public Library of Science, vol. 15(3), pages 1-9, March.
    8. Stephan B Bruns & John P A Ioannidis, 2016. "p-Curve and p-Hacking in Observational Research," PLOS ONE, Public Library of Science, vol. 11(2), pages 1-13, February.
    9. Miguel Baiao & Ilze Buligina, 2021. "Work Experience Led Programs and Employment Attainment," International Journal of Economics & Business Administration (IJEBA), International Journal of Economics & Business Administration (IJEBA), vol. 0(1), pages 180-198.
    10. Brodeur, Abel & Cook, Nikolai & Heyes, Anthony, 2022. "We Need to Talk about Mechanical Turk: What 22,989 Hypothesis Tests Tell us about p-Hacking and Publication Bias in Online Experiments," GLO Discussion Paper Series 1157, Global Labor Organization (GLO).
    11. Julia Roloff & Michael J. Zyphur, 2019. "Null Findings, Replications and Preregistered Studies in Business Ethics Research," Journal of Business Ethics, Springer, vol. 160(3), pages 609-619, December.
    12. Ingmar Böschen, 2021. "Software review: The JATSdecoder package—extract metadata, abstract and sectioned text from NISO-JATS coded XML documents; Insights to PubMed central’s open access database," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(12), pages 9585-9601, December.
    13. Freuli, Francesca & Held, Leonhard & Heyard, Rachel, 2022. "Replication Success under Questionable Research Practices - A Simulation Study," I4R Discussion Paper Series 2, The Institute for Replication (I4R).
    14. Graham Elliott & Nikolay Kudrin & Kaspar Wuthrich, 2022. "The Power of Tests for Detecting $p$-Hacking," Papers 2205.07950, arXiv.org, revised Apr 2024.
    15. Marko Kovic & Nina Hänsli, 2017. "The Impact of Political Cleavages, Religiosity, and Values on Attitudes towards Nonprofit Organizations," Social Sciences, MDPI, vol. 7(1), pages 1-18, December.
    16. Martin E Héroux & Janet L Taylor & Simon C Gandevia, 2015. "The Use and Abuse of Transcranial Magnetic Stimulation to Modulate Corticospinal Excitability in Humans," PLOS ONE, Public Library of Science, vol. 10(12), pages 1-10, December.
    17. Pierre J C Chuard & Milan Vrtílek & Megan L Head & Michael D Jennions, 2019. "Evidence that nonsignificant results are sometimes preferred: Reverse P-hacking or selective reporting?," PLOS Biology, Public Library of Science, vol. 17(1), pages 1-7, January.
    18. Bilgin, Rumeysa, 2023. "The Selection Of Control Variables In Capital Structure Research With Machine Learning," SocArXiv e26qf, Center for Open Science.
    19. Tracey L Weissgerber, 2021. "Learning from the past to develop data analysis curricula for the future," PLOS Biology, Public Library of Science, vol. 19(7), pages 1-3, July.
    20. van Aert, Robbie Cornelis Maria & van Assen, Marcel A. L. M., 2018. "P-uniform," MetaArXiv zqjr9, Center for Open Science.
