Auditory Attention Decoding in Four-Talker Environment with EEG

Cited by: 0
Authors
Yan, Yujie [2 ,3 ]
Xu, Xiran [1 ,3 ]
Zhu, Haolin [1 ,3 ]
Tian, Pei [1 ]
Ge, Zhongshu [1 ]
Wu, Xihong [1 ,3 ]
Chen, Jing [1 ,2 ,3 ]
Affiliations
[1] Peking Univ, Speech & Hearing Res Ctr, Sch Intelligence Sci & Technol, Beijing, Peoples R China
[2] Peking Univ, Coll Future Technol, Natl Biomed Imaging Ctr, Beijing, Peoples R China
[3] Natl Key Lab Gen Artificial Intelligence, Beijing, Peoples R China
Source
INTERSPEECH 2024, 2024
Funding
National Natural Science Foundation of China;
Keywords
auditory attention decoding; auditory spatial attention detection; stimulus reconstruction; temporal response functions; EEG; DenseNet; SPEECH; TRACKING;
DOI
10.21437/Interspeech.2024-739
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Auditory Attention Decoding (AAD) is a technique that identifies the focus of a listener's attention in complex auditory scenes from cortical neural responses. Existing research has largely examined two-talker scenarios, which fall short of real-world complexity. This study introduced a new AAD database for a four-talker scenario, in which speech from four distinct talkers was presented simultaneously from spatially separated locations while listeners' EEG was recorded. Temporal response function (TRF) analysis showed that TRFs to the attended speech were stronger than those to each of the unattended speech streams. AAD methods based on stimulus reconstruction (SR) and on cortical spatial lateralization were employed and compared. Results indicated a decoding accuracy of 77.5% with a 60-s decision window (chance level: 25%) using SR. Auditory spatial attention detection (ASAD) methods also achieved high accuracy (94.7% with DenseNet-3D in a 1-s window), demonstrating the generalization performance of ASAD methods.
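The stimulus-reconstruction approach mentioned in the abstract is conventionally implemented as a backward linear model: a decoder maps time-lagged EEG to a reconstructed speech envelope, and the attended talker is taken to be the one whose envelope correlates best with the reconstruction. The sketch below illustrates that general scheme only; it is not the paper's implementation, and all shapes, lag ranges, and the ridge parameter are illustrative assumptions.

```python
# Hypothetical sketch of correlation-based AAD via linear stimulus
# reconstruction (backward model with ridge regression). Not the
# authors' code; lags and regularization are illustrative choices.
import numpy as np

def time_lag_matrix(eeg, lags):
    """Stack time-lagged copies of each EEG channel: (samples, channels*lags)."""
    n_samples, n_channels = eeg.shape
    X = np.zeros((n_samples, n_channels * len(lags)))
    for i, lag in enumerate(lags):
        shifted = np.roll(eeg, lag, axis=0)
        if lag > 0:
            shifted[:lag] = 0.0  # zero-pad instead of wrapping around
        X[:, i * n_channels:(i + 1) * n_channels] = shifted
    return X

def train_decoder(eeg, envelope, lags, ridge=1e2):
    """Fit a backward model mapping lagged EEG to the attended envelope."""
    X = time_lag_matrix(eeg, lags)
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ envelope)

def decode_attention(eeg, candidate_envelopes, decoder, lags):
    """Pick the talker whose envelope best matches the reconstruction."""
    recon = time_lag_matrix(eeg, lags) @ decoder
    corrs = [np.corrcoef(recon, env)[0, 1] for env in candidate_envelopes]
    return int(np.argmax(corrs))
```

In a four-talker setting, `candidate_envelopes` would hold the four speech envelopes of one decision window, and chance-level accuracy for `decode_attention` is 25%, matching the chance level quoted in the abstract.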
Pages: 432-436
Number of pages: 5