Real-time synthesis of imagined speech processes from minimally invasive recordings of neural activity

Cited by: 69
Authors
Angrick, Miguel [1 ]
Ottenhoff, Maarten C. [2 ]
Diener, Lorenz [1 ]
Ivucic, Darius [1 ]
Ivucic, Gabriel [1 ]
Goulis, Sophocles [2 ]
Saal, Jeremy [2 ]
Colon, Albert J. [3 ]
Wagner, Louis [3 ]
Krusienski, Dean J. [4 ]
Kubben, Pieter L. [2 ,5 ]
Schultz, Tanja [1 ]
Herff, Christian [2 ]
Affiliations
[1] Univ Bremen, Cognit Syst Lab, Bremen, Germany
[2] Maastricht Univ, Sch Mental Hlth & Neurosci, Dept Neurosurg, Maastricht, Netherlands
[3] Kempenhaeghe Maastricht Univ, Med Ctr, Acad Ctr Epileptol, Kempenhaeghe, Netherlands
[4] Virginia Commonwealth Univ, Biomed Engn Dept, ASPEN Lab, Richmond, VA USA
[5] Kempenhaeghe Maastricht Univ, Med Ctr, Acad Ctr Epileptol, Maastricht, Netherlands
Funding
Dutch Research Council; US National Science Foundation;
Keywords
BRAIN-COMPUTER INTERFACE; GAMMA ACTIVITY; FEEDBACK; SPOKEN;
DOI
10.1038/s42003-021-02578-0
Chinese Library Classification
Q [Biosciences];
Subject Classification Codes
07 ; 0710 ; 09 ;
Abstract
Miguel Angrick et al. develop an intracranial EEG-based method to decode imagined speech from a human patient and translate it into audible speech in real time. This report presents an important proof of concept that acoustic output can be reconstructed on the basis of neural signals, and serves as a valuable step in the development of neuroprostheses to help nonverbal patients interact with their environment.

Speech neuroprosthetics aim to provide a natural communication channel to individuals who are unable to speak due to physical or neurological impairments. Real-time synthesis of acoustic speech directly from measured neural activity could enable natural conversations and notably improve quality of life, particularly for individuals who have severely limited means of communication. Recent advances in decoding approaches have led to high-quality reconstructions of acoustic speech from invasively measured neural activity. However, most prior research utilizes data collected during open-loop experiments of articulated speech, which might not directly translate to imagined speech processes. Here, we present an approach that synthesizes audible speech in real time for both imagined and whispered speech conditions. Using a participant implanted with stereotactic depth electrodes, we were able to reliably generate audible speech in real time. The decoding models rely predominantly on frontal activity, suggesting that speech processes have similar representations when vocalized, whispered, or imagined. While the reconstructed audio is not yet intelligible, our real-time synthesis approach represents an essential step towards investigating how patients will learn to operate a closed-loop speech neuroprosthesis based on imagined speech.
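The abstract describes a closed-loop pipeline in which neural activity is decoded into acoustic speech as it is recorded. The sketch below illustrates that general scheme, not the authors' actual implementation: short chunks of multichannel signal are reduced to high-gamma band power per channel (a feature commonly used in speech decoding, and one of the paper's keywords), then mapped to acoustic parameters by a pre-trained decoder. The sampling rate, channel count, window length, and the linear decoder are all illustrative assumptions.

```python
# Hypothetical sketch of a chunk-wise neural-to-speech decoding loop.
# All parameters and the linear decoder are illustrative assumptions;
# they do not reproduce the authors' models.
import numpy as np
from scipy.signal import butter, sosfilt

FS = 1024          # neural sampling rate in Hz (assumed)
N_CHANNELS = 64    # number of sEEG contacts (assumed)
CHUNK_SEC = 0.05   # 50 ms decoding window (assumed)
N_ACOUSTIC = 20    # number of acoustic parameters per frame (assumed)

# Causal band-pass filter for the high-gamma band (70-170 Hz).
sos = butter(4, [70, 170], btype="bandpass", fs=FS, output="sos")

def high_gamma_power(chunk: np.ndarray) -> np.ndarray:
    """Log high-gamma power per channel for one chunk (channels x samples)."""
    filtered = sosfilt(sos, chunk, axis=1)
    return np.log(np.mean(filtered ** 2, axis=1) + 1e-12)

def decode_chunk(features: np.ndarray,
                 weights: np.ndarray,
                 bias: np.ndarray) -> np.ndarray:
    """Map neural features to acoustic parameters with a linear decoder."""
    return weights @ features + bias

# Simulated loop over synthetic noise standing in for neural recordings.
rng = np.random.default_rng(0)
weights = rng.standard_normal((N_ACOUSTIC, N_CHANNELS)) * 0.1
bias = np.zeros(N_ACOUSTIC)
samples_per_chunk = int(FS * CHUNK_SEC)

acoustic_frames = []
for _ in range(10):  # ten 50 ms chunks = 0.5 s of acoustic parameters
    chunk = rng.standard_normal((N_CHANNELS, samples_per_chunk))
    features = high_gamma_power(chunk)
    acoustic_frames.append(decode_chunk(features, weights, bias))
```

In a real system, each frame of acoustic parameters would be passed to a vocoder and played back with low latency; here the frames are only collected, since no real neural data or trained decoder is available.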
Pages: 10