Author
Listed:
- Tao Tu (Google Research)
- Mike Schaekermann (Google Research)
- Anil Palepu (Google Research)
- Khaled Saab (Google Research)
- Jan Freyberg (Google Research)
- Ryutaro Tanno (Google DeepMind)
- Amy Wang (Google Research)
- Brenna Li (Google Research)
- Mohamed Amin (Google Research)
- Yong Cheng (Google DeepMind)
- Elahe Vedadi (Google Research)
- Nenad Tomasev (Google DeepMind)
- Shekoofeh Azizi (Google DeepMind)
- Karan Singhal (Google Research)
- Le Hou (Google Research)
- Albert Webson (Google DeepMind)
- Kavita Kulkarni (Google Research)
- S. Sara Mahdavi (Google DeepMind)
- Christopher Semturs (Google Research)
- Juraj Gottweis (Google Research)
- Joelle Barral (Google DeepMind)
- Katherine Chou (Google Research)
- Greg S. Corrado (Google Research)
- Yossi Matias (Google Research)
- Alan Karthikesalingam (Google Research)
- Vivek Natarajan (Google Research)
Abstract
At the heart of medicine lies physician–patient dialogue, where skillful history-taking enables effective diagnosis, management and enduring trust [1,2]. Artificial intelligence (AI) systems capable of diagnostic dialogue could increase accessibility and quality of care. However, approximating clinicians’ expertise is an outstanding challenge. Here we introduce AMIE (Articulate Medical Intelligence Explorer), a large language model (LLM)-based AI system optimized for diagnostic dialogue. AMIE uses a self-play-based [3] simulated environment with automated feedback for scaling learning across disease conditions, specialties and contexts. We designed a framework for evaluating clinically meaningful axes of performance, including history-taking, diagnostic accuracy, management, communication skills and empathy. We compared AMIE’s performance to that of primary care physicians in a randomized, double-blind crossover study of text-based consultations with validated patient-actors similar to objective structured clinical examination [4,5]. The study included 159 case scenarios from providers in Canada, the United Kingdom and India, 20 primary care physicians compared to AMIE, and evaluations by specialist physicians and patient-actors. AMIE demonstrated greater diagnostic accuracy and superior performance on 30 out of 32 axes according to the specialist physicians and 25 out of 26 axes according to the patient-actors. Our research has several limitations and should be interpreted with caution. Clinicians used synchronous text chat, which permits large-scale LLM–patient interactions, but this is unfamiliar in clinical practice. While further research is required before AMIE could be translated to real-world settings, the results represent a milestone towards conversational diagnostic AI.
Suggested Citation
Tao Tu & Mike Schaekermann & Anil Palepu & Khaled Saab & Jan Freyberg & Ryutaro Tanno & Amy Wang & Brenna Li & Mohamed Amin & Yong Cheng & Elahe Vedadi & Nenad Tomasev & Shekoofeh Azizi & Karan Singhal, 2025.
"Towards conversational diagnostic artificial intelligence,"
Nature, Nature, vol. 642(8067), pages 442-450, June.
Handle:
RePEc:nat:nature:v:642:y:2025:i:8067:d:10.1038_s41586-025-08866-7
DOI: 10.1038/s41586-025-08866-7