Auditory-Inspired Speech Envelope Extraction Methods for Improved EEG-Based Auditory Attention Detection in a Cocktail Party Scenario

Cited by: 156
Authors
Biesmans, Wouter [1]
Das, Neetha [1, 2, 3]
Francart, Tom [2]
Bertrand, Alexander [1, 3]
Affiliations
[1] Katholieke Univ Leuven, Dept Elect Engn ESAT, Stadius Ctr Dynam Syst Signal Proc & Data Analyt, B-3001 Leuven, Belgium
[2] Katholieke Univ Leuven, ExpORL, Dept Neurosci, B-3000 Leuven, Belgium
[3] iMinds Med IT, Leuven, Belgium
Keywords
Auditory attention; auditory models; cocktail party; electroencephalography (EEG) processing; neuro-steered auditory prostheses; speech envelope; ATTENDED SPEECH; ENVIRONMENT; SPEAKER
DOI
10.1109/TNSRE.2016.2571900
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
This paper considers the auditory attention detection (AAD) paradigm, where the goal is to determine which of two simultaneous speakers a person is attending to. The paradigm relies on recordings of the listener's brain activity, e.g., from electroencephalography (EEG). To perform AAD, decoded EEG signals are typically correlated with the temporal envelopes of the speech signals of the separate speakers. In this paper, we study how the inclusion of various degrees of auditory modelling in this speech envelope extraction process affects the AAD performance, where the best performance is found for an auditory-inspired linear filter bank followed by power law compression. These two modelling stages are computationally cheap, which is important for implementation in wearable devices, such as future neuro-steered auditory prostheses. We also introduce a more natural way to combine recordings (over trials and subjects) to train the decoder, which reduces the dependence of the algorithm on regularization parameters. Finally, we investigate the simultaneous design of the EEG decoder and the audio subband envelope recombination weights vector using either a norm-constrained least squares or a canonical correlation analysis, but conclude that this increases computational complexity without improving AAD performance.
Pages: 402-412
Page count: 11
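The abstract describes two steps that a short sketch can make concrete: extracting a speech envelope with a filter bank followed by power law compression, and deciding the attended speaker by correlating an EEG-decoded signal with each speaker's envelope. The Python below is a minimal illustration under stated assumptions, not the authors' implementation. The rectangular FFT-domain bands, the band edges, the compression exponent of 0.6, and the function names `subband_envelope` and `detect_attended` are all hypothetical choices made for illustration. The EEG decoder itself is not modelled; a noisy copy of one envelope stands in for its output.

```python
import numpy as np


def subband_envelope(audio, fs, band_edges_hz, exponent=0.6):
    """Sum of power-law-compressed subband magnitudes.

    Assumption: crude rectangular bands in the FFT domain stand in for an
    auditory-inspired filter bank; the exponent value is illustrative.
    """
    n = len(audio)
    spectrum = np.fft.rfft(audio)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    envelope = np.zeros(n)
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        band = np.fft.irfft(spectrum * mask, n=n)  # band-limited signal
        envelope += np.abs(band) ** exponent       # power law compression
    return envelope


def detect_attended(decoded, envelope_a, envelope_b):
    """Pick the speaker whose envelope correlates best with the decoded signal."""
    r_a = np.corrcoef(decoded, envelope_a)[0, 1]
    r_b = np.corrcoef(decoded, envelope_b)[0, 1]
    return ("A" if r_a >= r_b else "B"), r_a, r_b


# Synthetic demo: two amplitude-modulated noise "speakers".
fs = 8000
rng = np.random.default_rng(0)
t = np.arange(2 * fs) / fs
speaker_a = (1 + np.sin(2 * np.pi * 3 * t)) * rng.standard_normal(t.size)
speaker_b = (1 + np.sin(2 * np.pi * 5 * t + 1)) * rng.standard_normal(t.size)

edges = [100, 400, 1000, 2000, 4000]  # hypothetical band edges in Hz
env_a = subband_envelope(speaker_a, fs, edges)
env_b = subband_envelope(speaker_b, fs, edges)

# Stand-in for the EEG decoder output: envelope A plus noise (assumption).
decoded = env_a + 0.5 * env_a.std() * rng.standard_normal(t.size)
label, r_a, r_b = detect_attended(decoded, env_a, env_b)
print(label, r_a, r_b)
```

In a real pipeline the envelopes would typically also be low-pass filtered and downsampled to the EEG rate, and `decoded` would come from a trained linear decoder applied to the EEG channels; both steps are omitted here.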