
Does small talk with a medical provider affect ChatGPT’s medical counsel? Performance of ChatGPT on USMLE with and without distractions

Author

Listed:
  • Myriam Safrai
  • Amos Azaria

Abstract

Efforts are being made to improve the time efficiency of healthcare providers. Artificial intelligence tools can help transcribe and summarize physician-patient encounters and produce medical notes and medical recommendations. However, in addition to medical information, discussions between healthcare providers and patients include small talk and other information irrelevant to medical concerns. Because Large Language Models (LLMs) are predictive models that build their responses from the words in the prompt, there is a risk that small talk and irrelevant information may alter the response and the suggestions given. Therefore, this study investigates the impact of mixing small talk into medical data on the accuracy of medical advice provided by ChatGPT. USMLE Step 3 questions were used as a model for relevant medical data. We used both multiple-choice and open-ended questions. First, we gathered small talk sentences from human participants using the Mechanical Turk platform. Second, both sets of USMLE questions were arranged in a pattern where each sentence from the original questions was followed by a small talk sentence. ChatGPT 3.5 and 4 were asked to answer both sets of questions with and without the small talk sentences. Finally, a board-certified physician analyzed the answers by ChatGPT and compared them to the formal correct answers. The results demonstrate that the ability of ChatGPT-3.5 to answer correctly was impaired when small talk was added to medical data (66.8% vs. 56.6%; p = 0.025). By question type, the difference was not statistically significant for multiple-choice questions (72.1% vs. 68.9%; p = 0.67) but was significant for open-ended questions (61.5% vs. 44.3%; p = 0.01). In contrast, small talk phrases did not impair ChatGPT-4's ability on either type of question (83.6% and 66.2%, respectively). According to these results, ChatGPT-4 seems more accurate than the earlier 3.5 version, and it appears that small talk does not impair its capability to provide medical recommendations.
Our results are an important first step in understanding the potential and limitations of utilizing ChatGPT and other LLMs for physician-patient interactions, which include casual conversations.
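
The interleaving pattern described in the abstract, where each sentence of a question is followed by one small talk sentence, might be sketched roughly as follows. This is an illustration only, not the authors' code: the function names, the naive punctuation-based sentence splitter, the choice to reuse small talk sentences cyclically, and the sample question and small talk sentences are all assumptions made for the example.

```python
import re
from itertools import cycle


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '?' or '!' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.?!])\s+", text.strip()) if s]


def interleave_small_talk(question, small_talk):
    """Return the question with one small talk sentence after each original sentence.

    Small talk sentences are reused in order if the question has more
    sentences than the pool provides (an assumption, not from the paper).
    """
    if not small_talk:
        raise ValueError("small_talk must contain at least one sentence")
    filler = cycle(small_talk)
    mixed = []
    for sentence in split_sentences(question):
        mixed.append(sentence)      # original medical sentence
        mixed.append(next(filler))  # distracting small talk sentence
    return " ".join(mixed)


# Hypothetical inputs, invented for illustration (not taken from the study).
question = "A 45-year-old man has chest pain. What is the next step?"
small_talk = [
    "The weather is lovely today.",
    "I watched a great movie last night.",
]
print(interleave_small_talk(question, small_talk))
```

Both the original and the interleaved version of each question could then be sent to the model, so that the answers can be compared with and without the distraction.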

Suggested Citation

  • Myriam Safrai & Amos Azaria, 2024. "Does small talk with a medical provider affect ChatGPT’s medical counsel? Performance of ChatGPT on USMLE with and without distractions," PLOS ONE, Public Library of Science, vol. 19(4), pages 1-13, April.
  • Handle: RePEc:plo:pone00:0302217
    DOI: 10.1371/journal.pone.0302217

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0302217
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0302217&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sweldens, Steven & Puntoni, Stefano & Paolacci, Gabriele & Vissers, Maarten, 2014. "The bias in the bias: Comparative optimism as a function of event social undesirability," Organizational Behavior and Human Decision Processes, Elsevier, vol. 124(2), pages 229-244.
    2. Hsu, Dan K. & Burmeister-Lamp, Katrin & Simmons, Sharon A. & Foo, Maw-Der & Hong, Michelle C. & Pipes, Jesse D., 2019. "“I know I can, but I don't fit”: Perceived fit, self-efficacy, and entrepreneurial intention," Journal of Business Venturing, Elsevier, vol. 34(2), pages 311-326.
    3. Lutz, Christoph & Newlands, Gemma, 2018. "Consumer segmentation within the sharing economy: The case of Airbnb," Journal of Business Research, Elsevier, vol. 88(C), pages 187-196.
    4. Mariconda, Simone & Lurati, Francesco, 2015. "Does familiarity breed stability? The role of familiarity in moderating the effects of new information on reputation judgments," Journal of Business Research, Elsevier, vol. 68(5), pages 957-964.
    5. Tobias Schlager & Ashley V. Whillans, 2022. "People underestimate the probability of contracting the coronavirus from friends," Palgrave Communications, Palgrave Macmillan, vol. 9(1), pages 1-11, December.
    6. Charness, Gary & Gneezy, Uri & Kuhn, Michael A., 2013. "Experimental methods: Extra-laboratory experiments-extending the reach of experimental economics," Journal of Economic Behavior & Organization, Elsevier, vol. 91(C), pages 93-100.
    7. Orazi, Davide C. & Pizzetti, Marta, 2015. "Revisiting fear appeals: A structural re-inquiry of the protection motivation model," International Journal of Research in Marketing, Elsevier, vol. 32(2), pages 223-225.
    8. Cantarella, Michele & Strozzi, Chiara, 2019. "Workers in the Crowd: The Labour Market Impact of the Online Platform Economy," IZA Discussion Papers 12327, Institute of Labor Economics (IZA).
    9. Gökçe Esenduran & James A. Hill & In Joon Noh, 2020. "Understanding the Choice of Online Resale Channel for Used Electronics," Production and Operations Management, Production and Operations Management Society, vol. 29(5), pages 1188-1211, May.
    10. Azzam, Tarek & Harman, Elena, 2016. "Crowdsourcing for quantifying transcripts: An exploratory study," Evaluation and Program Planning, Elsevier, vol. 54(C), pages 63-73.
    11. repec:cup:judgdm:v:9:y:2014:i:3:p:287-296 is not listed on IDEAS
    12. Ronayne, David & Sgroi, Daniel & Tuckwell, Anthony, 2021. "Evaluating the sunk cost effect," Journal of Economic Behavior & Organization, Elsevier, vol. 186(C), pages 318-327.
    13. Gandullia, Luca & Lezzi, Emanuela & Parciasepe, Paolo, 2020. "Replication with MTurk of the experimental design by Gangadharan, Grossman, Jones & Leister (2018): Charitable giving across donor types," Journal of Economic Psychology, Elsevier, vol. 78(C).
    14. Prissé, Benjamin & Jorrat, Diego, 2022. "Lab vs online experiments: No differences," Journal of Behavioral and Experimental Economics (formerly The Journal of Socio-Economics), Elsevier, vol. 100(C).
    15. Efrat Dressler & Yevgeny Mugerman, 2023. "Doing the Right Thing? The Voting Power Effect and Institutional Shareholder Voting," Journal of Business Ethics, Springer, vol. 183(4), pages 1089-1112, April.
    16. Valerio Capraro & Hélène Barcelo, 2021. "Punishing defectors and rewarding cooperators: Do people discriminate between genders?," Journal of the Economic Science Association, Springer;Economic Science Association, vol. 7(1), pages 19-32, September.
    17. Gupta, Vishal K. & Goktan, A. Banu & Gunay, Gonca, 2014. "Gender differences in evaluation of new business opportunity: A stereotype threat perspective," Journal of Business Venturing, Elsevier, vol. 29(2), pages 273-288.
    18. Garbarino, Ellen & Slonim, Robert & Villeval, Marie Claire, 2019. "Loss aversion and lying behavior," Journal of Economic Behavior & Organization, Elsevier, vol. 158(C), pages 379-393.
    19. Lefgren, Lars J. & Sims, David P. & Stoddard, Olga B., 2016. "Effort, luck, and voting for redistribution," Journal of Public Economics, Elsevier, vol. 143(C), pages 89-97.
    20. Dahling, Jason J. & Wiley, Shaun & Fishman, Zachary A. & Loihle, Amber, 2016. "A stake in the fight: When do heterosexual employees resist organizational policies that deny marriage equality to LGB peers?," Organizational Behavior and Human Decision Processes, Elsevier, vol. 132(C), pages 1-15.
    21. Lingmont, Derek N.J. & Alexiou, Andreas, 2020. "The contingent effect of job automating technology awareness on perceived job insecurity: Exploring the moderating role of organizational culture," Technological Forecasting and Social Change, Elsevier, vol. 161(C).
