
A unified acoustic-to-speech-to-language embedding space captures the neural basis of natural language processing in everyday conversations

Author

Listed:
  • Ariel Goldstein

    (Hebrew University; Google Research)

  • Haocheng Wang

    (Princeton University)

  • Leonard Niekerken

    (Princeton University; Maastricht University)

  • Mariano Schain

    (Google Research)

  • Zaid Zada

    (Princeton University)

  • Bobbi Aubrey

    (Princeton University)

  • Tom Sheffer

    (Google Research)

  • Samuel A. Nastase

    (Princeton University)

  • Harshvardhan Gazula

    (Princeton University; Massachusetts General Hospital and Harvard Medical School)

  • Aditi Singh

    (Princeton University)

  • Aditi Rao

    (Princeton University)

  • Gina Choe

    (Princeton University)

  • Catherine Kim

    (Princeton University)

  • Werner Doyle

    (New York University School of Medicine)

  • Daniel Friedman

    (New York University School of Medicine)

  • Sasha Devore

    (New York University School of Medicine)

  • Patricia Dugan

    (New York University School of Medicine)

  • Avinatan Hassidim

    (Google Research)

  • Michael Brenner

    (Google Research; Harvard University)

  • Yossi Matias

    (Google Research)

  • Orrin Devinsky

    (New York University School of Medicine)

  • Adeen Flinker

    (New York University School of Medicine)

  • Uri Hasson

    (Princeton University)

Abstract

This study introduces a unified computational framework connecting acoustic, speech and word-level linguistic structures to study the neural basis of everyday conversations in the human brain. We used electrocorticography to record neural signals across 100 h of speech production and comprehension as participants engaged in open-ended real-life conversations. We extracted low-level acoustic, mid-level speech and contextual word embeddings from a multimodal speech-to-text model (Whisper). We developed encoding models that linearly map these embeddings onto brain activity during speech production and comprehension. Remarkably, this model accurately predicts neural activity at each level of the language processing hierarchy across hours of new conversations not used in training the model. The internal processing hierarchy in the model is aligned with the cortical hierarchy for speech and language processing, where sensory and motor regions better align with the model’s speech embeddings, and higher-level language areas better align with the model’s language embeddings. The Whisper model captures the temporal sequence of language-to-speech encoding before word articulation (speech production) and speech-to-language encoding post articulation (speech comprehension). The embeddings learned by this model outperform symbolic models in capturing neural activity supporting natural speech and language. These findings support a paradigm shift towards unified computational models that capture the entire processing hierarchy for speech comprehension and production in real-world conversations.
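
To illustrate the kind of linear encoding model the abstract describes, here is a minimal sketch; it is not the authors' pipeline. It assumes ridge regression as the linear map, random vectors standing in for Whisper embeddings, and synthetic signals standing in for electrode recordings. The function names `fit_ridge` and `encoding_score`, the regularization strength `alpha` and the train/test split are hypothetical choices made only for this example.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def encoding_score(X_train, Y_train, X_test, Y_test, alpha=1.0):
    """Fit on training words; return per-electrode Pearson r on held-out words."""
    # Center with training statistics only, so nothing leaks from the test set.
    x_mean = X_train.mean(axis=0)
    y_mean = Y_train.mean(axis=0)
    W = fit_ridge(X_train - x_mean, Y_train - y_mean, alpha)
    Y_pred = (X_test - x_mean) @ W + y_mean

    # Pearson correlation between predicted and actual signal, per electrode.
    yp = Y_pred - Y_pred.mean(axis=0)
    yt = Y_test - Y_test.mean(axis=0)
    denom = np.linalg.norm(yp, axis=0) * np.linalg.norm(yt, axis=0)
    return (yp * yt).sum(axis=0) / denom


# Synthetic stand-ins (assumptions): 600 "words", 20-dim embeddings, 5 electrodes.
rng = np.random.default_rng(0)
n_words, n_dims, n_electrodes = 600, 20, 5
X = rng.standard_normal((n_words, n_dims))
W_true = rng.standard_normal((n_dims, n_electrodes))
Y = X @ W_true + 0.5 * rng.standard_normal((n_words, n_electrodes))

# Train on the first 500 words, evaluate on the 100 words not used in training.
scores = encoding_score(X[:500], Y[:500], X[500:], Y[500:], alpha=1.0)
print(scores.round(2))
```

Scoring on held-out words mirrors the abstract's emphasis on predicting neural activity in conversations not used for training. In practice the regularization strength would be chosen by cross-validation rather than fixed.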

Suggested Citation

  • Ariel Goldstein & Haocheng Wang & Leonard Niekerken & Mariano Schain & Zaid Zada & Bobbi Aubrey & Tom Sheffer & Samuel A. Nastase & Harshvardhan Gazula & Aditi Singh & Aditi Rao & Gina Choe & Catherin, 2025. "A unified acoustic-to-speech-to-language embedding space captures the neural basis of natural language processing in everyday conversations," Nature Human Behaviour, Nature, vol. 9(5), pages 1041-1055, May.
  • Handle: RePEc:nat:nathum:v:9:y:2025:i:5:d:10.1038_s41562-025-02105-9
    DOI: 10.1038/s41562-025-02105-9

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41562-025-02105-9
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fabian Grabenhorst & Raymundo Báez-Mendoza, 2025. "Dynamic coding and sequential integration of multiple reward attributes by primate amygdala neurons," Nature Communications, Nature, vol. 16(1), pages 1-21, December.
    2. Jan Weber & Anne-Kristin Solbakk & Alejandro O. Blenkmann & Anais Llorens & Ingrid Funderud & Sabine Leske & Pål Gunnar Larsson & Jugoslav Ivanovic & Robert T. Knight & Tor Endestad & Randolph F. Helf, 2024. "Ramping dynamics and theta oscillations reflect dissociable signatures during rule-guided human behavior," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    3. Pierre O. Boucher & Tian Wang & Laura Carceroni & Gary Kane & Krishna V. Shenoy & Chandramouli Chandrasekaran, 2023. "Initial conditions combine with sensory evidence to induce decision-related dynamics in premotor cortex," Nature Communications, Nature, vol. 14(1), pages 1-28, December.
    4. Wenyi Zhang & Yang Xie & Tianming Yang, 2022. "Reward salience but not spatial attention dominates the value representation in the orbitofrontal cortex," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    5. Hagai Lalazar & L F Abbott & Eilon Vaadia, 2016. "Tuning Curves for Arm Posture Control in Motor Cortex Are Consistent with Random Connectivity," PLOS Computational Biology, Public Library of Science, vol. 12(5), pages 1-27, May.
    6. Jason S Prentice & Olivier Marre & Mark L Ioffe & Adrianna R Loback & Gašper Tkačik & Michael J Berry II, 2016. "Error-Robust Modes of the Retinal Population Code," PLOS Computational Biology, Public Library of Science, vol. 12(11), pages 1-32, November.
    7. Rishi Rajalingham & Hansem Sohn & Mehrdad Jazayeri, 2025. "Dynamic tracking of objects in the macaque dorsomedial frontal cortex," Nature Communications, Nature, vol. 16(1), pages 1-16, December.
    8. Sreejan Kumar & Theodore R. Sumers & Takateru Yamakoshi & Ariel Goldstein & Uri Hasson & Kenneth A. Norman & Thomas L. Griffiths & Robert D. Hawkins & Samuel A. Nastase, 2024. "Shared functional specialization in transformer-based language models and the human brain," Nature Communications, Nature, vol. 15(1), pages 1-19, December.
    9. Benjamin R Cowley & Matthew A Smith & Adam Kohn & Byron M Yu, 2016. "Stimulus-Driven Population Activity Patterns in Macaque Primary Visual Cortex," PLOS Computational Biology, Public Library of Science, vol. 12(12), pages 1-31, December.
    10. Seong-Hwan Hwang & Doyoung Park & Ji-Woo Lee & Sue-Hyun Lee & Hyoung F. Kim, 2024. "Convergent representation of values from tactile and visual inputs for efficient goal-directed behavior in the primate putamen," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    11. Spyridon Chavlis & Panayiota Poirazi, 2025. "Dendrites endow artificial neural networks with accurate, robust and parameter-efficient learning," Nature Communications, Nature, vol. 16(1), pages 1-17, December.
    12. Noel Federman & Sebastián A. Romano & Macarena Amigo-Duran & Lucca Salomon & Antonia Marin-Burgin, 2024. "Acquisition of non-olfactory encoding improves odour discrimination in olfactory cortex," Nature Communications, Nature, vol. 15(1), pages 1-19, December.
    13. Daniel Durstewitz, 2017. "A state space approach for piecewise-linear recurrent neural networks for identifying computational dynamics from neural measurements," PLOS Computational Biology, Public Library of Science, vol. 13(6), pages 1-33, June.
    14. Ziyan Huang & Myung Chung & Kentaro Tao & Akiyuki Watarai & Mu-Yun Wang & Hiroh Ito & Teruhiro Okuyama, 2023. "Ventromedial prefrontal neurons represent self-states shaped by vicarious fear in male mice," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    15. Dmitry R Lyamzin & Andrea Alamia & Mohammad Abdolrahmani & Ryo Aoki & Andrea Benucci, 2024. "Regularizing hyperparameters of interacting neural signals in the mouse cortex reflect states of arousal," PLOS Computational Biology, Public Library of Science, vol. 20(10), pages 1-25, October.
    16. David Kappel & Bernhard Nessler & Wolfgang Maass, 2014. "STDP Installs in Winner-Take-All Circuits an Online Approximation to Hidden Markov Model Learning," PLOS Computational Biology, Public Library of Science, vol. 10(3), pages 1-22, March.
    17. Laura E. Suárez & Agoston Mihalik & Filip Milisav & Kenji Marshall & Mingze Li & Petra E. Vértes & Guillaume Lajoie & Bratislav Misic, 2024. "Connectome-based reservoir computing with the conn2res toolbox," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    18. Takafumi Arakaki & G Barello & Yashar Ahmadian, 2019. "Inferring neural circuit structure from datasets of heterogeneous tuning curves," PLOS Computational Biology, Public Library of Science, vol. 15(4), pages 1-38, April.
    19. Alexandra Busch & Megan Roussy & Rogelio Luna & Matthew L. Leavitt & Maryam H. Mofrad & Roberto A. Gulli & Benjamin Corrigan & Ján Mináč & Adam J. Sachs & Lena Palaniyappan & Lyle Muller & Julio C. Ma, 2024. "Neuronal activation sequences in lateral prefrontal cortex encode visuospatial working memory during virtual navigation," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    20. Arno Onken & Jue Xie & Stefano Panzeri & Camillo Padoa-Schioppa, 2019. "Categorical encoding of decision variables in orbitofrontal cortex," PLOS Computational Biology, Public Library of Science, vol. 15(10), pages 1-27, October.
