Does small talk with a medical provider affect ChatGPT’s medical counsel? Performance of ChatGPT on USMLE with and without distractions

Author

Listed:

Myriam Safrai
Amos Azaria

Abstract

Efforts are being made to improve the time effectiveness of healthcare providers. Artificial intelligence tools can help transcribe and summarize physician-patient encounters and produce medical notes and medical recommendations. However, in addition to medical information, discussion between healthcare providers and patients includes small talk and other information irrelevant to medical concerns. As Large Language Models (LLMs) are predictive models that build their response from the words in the prompt, there is a risk that small talk and irrelevant information may alter the response and the suggestion given. Therefore, this study aims to investigate the impact of medical data mixed with small talk on the accuracy of medical advice provided by ChatGPT. USMLE Step 3 questions were used as a model for relevant medical data. We used both multiple-choice and open-ended questions. First, we gathered small talk sentences from human participants using the Mechanical Turk platform. Second, both sets of USMLE questions were arranged in a pattern where each sentence from the original questions was followed by a small talk sentence. ChatGPT 3.5 and 4 were asked to answer both sets of questions with and without the small talk sentences. Finally, a board-certified physician analyzed the answers by ChatGPT and compared them to the formal correct answer. The analysis results demonstrate that the ability of ChatGPT-3.5 to answer correctly was impaired when small talk was added to medical data (66.8% vs. 56.6%; p = 0.025). Specifically, accuracy without vs. with small talk was 72.1% vs. 68.9% (p = 0.67) for multiple-choice questions and 61.5% vs. 44.3% (p = 0.01) for open-ended questions. In contrast, small talk phrases did not impair ChatGPT-4's ability on either type of question (83.6% and 66.2%, respectively). According to these results, ChatGPT-4 seems more accurate than the earlier 3.5 version, and it appears that small talk does not impair its capability to provide medical recommendations.
Our results are an important first step in understanding the potential and limitations of utilizing ChatGPT and other LLMs for physician-patient interactions, which include casual conversations.
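
The interleaving pattern described in the abstract, where each sentence of a question is followed by a small talk sentence, can be sketched roughly as follows. This is only an illustration and not the authors' code: the function names, the naive sentence splitter, the choice to cycle through the small talk sentences in order, and the sample question are all assumptions made for the example.

```python
import re
from itertools import cycle


def split_sentences(text):
    """Naively split text after '.', '?' or '!' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.?!])\s+", text.strip()) if s]


def interleave_small_talk(question, small_talk):
    """Follow every sentence of `question` with one small talk sentence.

    Assumption: small talk sentences are reused in order (cycled) when the
    question has more sentences than there are small talk sentences. The
    abstract does not say how sentences were actually assigned.
    """
    if not small_talk:
        raise ValueError("at least one small talk sentence is required")
    filler = cycle(small_talk)
    mixed = []
    for sentence in split_sentences(question):
        mixed.append(sentence)
        mixed.append(next(filler))
    return " ".join(mixed)


# Made-up example input, not taken from the study's question set.
question = "A 45-year-old man has chest pain. What is the next step?"
small_talk = ["Nice weather today.", "How was your weekend?"]
mixed_prompt = interleave_small_talk(question, small_talk)
print(mixed_prompt)
```

The same question can then be sent to a model once as `question` and once as `mixed_prompt`, which mirrors the with and without small talk comparison the abstract reports.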

Suggested Citation

Myriam Safrai & Amos Azaria, 2024. "Does small talk with a medical provider affect ChatGPT’s medical counsel? Performance of ChatGPT on USMLE with and without distractions," PLOS ONE, Public Library of Science, vol. 19(4), pages 1-13, April.

Handle: RePEc:plo:pone00:0302217
DOI: 10.1371/journal.pone.0302217
