Decoding and synthesizing tonal language speech from brain activity

Cited by: 13
Authors
Liu, Yan [1, 2, 3, 4]
Zhao, Zehao [1, 2, 3, 4]
Xu, Minpeng [1, 2, 4, 5, 6]
Yu, Haiqing [1, 2, 4, 5]
Zhu, Yanming [1, 2, 3, 4]
Zhang, Jie [1, 2, 3, 4]
Bu, Linghao [1, 2, 3, 4, 7]
Zhang, Xiaoluo [1, 2, 3, 4]
Lu, Junfeng [1, 2, 3, 4, 8]
Li, Yuanning [1, 2, 4, 9]
Ming, Dong [1, 2, 4, 5, 6]
Wu, Jinsong [1, 2, 3, 4]
Affiliations
[1] Fudan Univ, Huashan Hosp, Shanghai Med Coll, Dept Neurosurg, Shanghai 200040, Peoples R China
[2] Natl Ctr Neurol Disorders, Shanghai 200052, Peoples R China
[3] Shanghai Key Lab Brain Funct Restorat & Neural Reg, Shanghai 200040, Peoples R China
[4] Fudan Univ, Neurosurg Inst, Shanghai 200052, Peoples R China
[5] Tianjin Univ, Coll Precis Instruments & Optoelect Engn, Dept Biomed Engn, Tianjin 300041, Peoples R China
[6] Tianjin Univ, Acad Med Engn & Translat Med, Tianjin 300041, Peoples R China
[7] Zhejiang Univ, Affiliated Hosp 1, Coll Med, Dept Neurosurg, Hangzhou 310000, Peoples R China
[8] Fudan Univ, MOE Frontiers Ctr Brain Sci, Huashan Hosp, Shanghai 200040, Peoples R China
[9] ShanghaiTech Univ, Sch Biomed Engn, Shanghai 201210, Peoples R China
Source
SCIENCE ADVANCES | 2023 / Vol. 9 / Issue 23
Keywords
HUMAN SENSORIMOTOR CORTEX; SPOKEN;
DOI
10.1126/sciadv.adh0478
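Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];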
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Recent studies have shown the feasibility of speech brain-computer interfaces (BCIs) as a clinically valid treatment for helping patients with communication disorders who speak nontonal languages restore their speech ability. However, building a speech BCI for tonal languages is challenging because producing lexical tones requires additional precise control of laryngeal movements. Thus, the model should emphasize features from the tone-related cortex. Here, we designed a modularized multistream neural network that directly synthesizes tonal language speech from intracranial recordings. The network decoded lexical tones and base syllables independently via parallel streams of neural network modules inspired by neuroscience findings. The speech was synthesized by combining tonal syllable labels with nondiscriminant speech neural activity. Compared with commonly used baseline models, our proposed models achieved higher performance with modest training data and computational costs. These findings suggest a potential strategy for approaching tonal language speech restoration.
Pages: 11