Towards Naturalistic Speech Decoding from Intracranial Brain Data

Cited by: 1

Authors:

Berezutskaya, Julia ^{[1]}

Ambrogioni, Luca ^{[1]}

Ramsey, Nicolas F. ^{[2]}

van Gerven, Marcel A. J. ^{[1]}

Affiliations:

[1] Radboud Univ Nijmegen, Donders Inst Brain Cognit & Behav, Thomas van Aquinostr 4, NL-6525 GD Nijmegen, Netherlands

[2] Univ Med Ctr Utrecht, Brain Ctr, Dept Neurol & Neurosurg, Heidelberglaan 100, NL-3584 CX Utrecht, Netherlands

Source:

2022 44TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE & BIOLOGY SOCIETY, EMBC | 2022

Funding:

European Research Council;

Keywords:

DOI:

10.1109/EMBC48229.2022.9871301

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Speech decoding from brain activity can enable the development of brain-computer interfaces (BCIs) to restore naturalistic communication in paralyzed patients. Previous work has focused on developing decoding models from isolated speech data with a clean background and multiple repetitions of the material. In this study, we describe a novel approach to speech decoding that relies on a generative adversarial neural network (GAN) to reconstruct speech from brain data recorded during a naturalistic speech listening task (watching a movie). We compared the GAN-based approach, where reconstruction was done from the compressed latent representation of sound decoded from the brain, with several baseline models that reconstructed the sound spectrogram directly. We show that the novel approach provides more accurate reconstructions than the baselines. These results underscore the potential of GAN models for speech decoding in naturalistic noisy environments and for further advancing BCIs for naturalistic communication.

Citation

Pages: 3100-3104

Page count: 5

30 entries in total

[1] Towards reconstructing intelligible speech from the human auditory cortex [J].