Attention and Sequence Modeling for Match-Mismatch Classification of Speech Stimulus and EEG Response

Cited by: 1
Authors
Borsdorf, Marvin [2]
Cai, Siqi [2]
Pahuja, Saurav [2]
De Silva, Dashanka [2]
Li, Haizhou [1,2,3]
Schultz, Tanja [4]
Affiliations
[1] Chinese Univ Hong Kong, Shenzhen Res Inst Big Data, Sch Data Sci, Shenzhen 518172, Peoples R China
[2] Univ Bremen, Machine Listening Lab, D-28359 Bremen, Germany
[3] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore 119077, Singapore
[4] Univ Bremen, Cognit Syst Lab, D-28359 Bremen, Germany
Source
IEEE OPEN JOURNAL OF SIGNAL PROCESSING | 2024, Vol. 5
Keywords
Auditory system; EEG decoding; match-mismatch classification; speech envelope; speech stimulus; SPEAKER EXTRACTION; AUDITORY ATTENTION; NEURAL-NETWORK; BRAIN; LSTM; TRANSFORMER; ENVIRONMENT; PERCEPTION;
DOI
10.1109/OJSP.2023.3340063
Chinese Library Classification
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Subject Classification Code
0808; 0809;
Abstract
For the development of neuro-steered hearing aids, it is important to study the relationship between a speech stimulus and the elicited EEG response of a human listener. The recent Auditory EEG Decoding Challenge 2023 (Signal Processing Grand Challenge, IEEE International Conference on Acoustics, Speech and Signal Processing) dealt with this relationship in the context of a match-mismatch classification task: given two speech stimuli, identify the one that elicited a specific EEG response. Participating in the challenge, we adopted the challenge's baseline model and explored an attention encoder to replace the spatial convolution in the EEG processing pipeline, as well as additional sequence modeling methods based on RNN, LSTM, and GRU to preprocess the speech stimuli. We compared speech envelopes and mel-spectrograms as two different types of input speech stimulus and evaluated our models on a test set as well as on held-out stories and held-out subjects benchmark sets. We show that mel-spectrograms generally offer better results. Replacing the spatial convolution with an attention encoder helps to capture spatial and temporal information in the EEG response more effectively, and the sequence modeling methods further enhance performance when mel-spectrograms are used. Both changes lead to higher performance on the test set and on the held-out stories benchmark set. Our best model outperforms the baseline by 1.91% on the test set and by 1.35% on the total ranking score. We ranked second in the challenge.
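The record contains only the abstract, not the architecture details. For orientation, below is a minimal PyTorch sketch of how a match-mismatch classifier of this kind could be wired up: an attention encoder on the EEG branch in place of a spatial convolution, and a GRU (one of the sequence models the abstract mentions, alongside RNN and LSTM) over mel-spectrogram frames of each candidate stimulus. All module names, layer sizes, and rates here (64 EEG channels, 10 mel bins, 64 Hz frames, 3-second windows) are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of the match-mismatch setup described in the abstract.
# All sizes and design choices are assumptions for illustration only.
import torch
import torch.nn as nn

class MatchMismatchNet(nn.Module):
    def __init__(self, eeg_channels=64, mel_bins=10, dim=64):
        super().__init__()
        # EEG branch: project channels to a model dimension, then apply a
        # self-attention encoder instead of a spatial convolution.
        self.eeg_proj = nn.Linear(eeg_channels, dim)
        self.eeg_attn = nn.TransformerEncoderLayer(
            d_model=dim, nhead=4, batch_first=True)
        # Stimulus branch: a GRU over mel-spectrogram frames.
        self.stim_gru = nn.GRU(mel_bins, dim, batch_first=True)

    def encode_eeg(self, eeg):          # eeg: (batch, time, channels)
        return self.eeg_attn(self.eeg_proj(eeg)).mean(dim=1)

    def encode_stim(self, mel):         # mel: (batch, time, mel_bins)
        out, _ = self.stim_gru(mel)
        return out.mean(dim=1)

    def forward(self, eeg, mel_a, mel_b):
        # Score each candidate stimulus against the EEG embedding with
        # cosine similarity; the higher-scoring candidate is the "match".
        e = self.encode_eeg(eeg)
        sim_a = torch.cosine_similarity(e, self.encode_stim(mel_a), dim=-1)
        sim_b = torch.cosine_similarity(e, self.encode_stim(mel_b), dim=-1)
        return torch.stack([sim_a, sim_b], dim=-1)   # logits over 2 choices

# Usage with dummy data: 3-second windows at an assumed 64 Hz frame rate.
model = MatchMismatchNet()
eeg = torch.randn(8, 192, 64)
mel_a, mel_b = torch.randn(8, 192, 10), torch.randn(8, 192, 10)
pred = model(eeg, mel_a, mel_b).argmax(dim=-1)       # 0 -> mel_a, 1 -> mel_b
```

Trained with cross-entropy over the two similarity scores, such a model learns a shared embedding space in which the matching stimulus lies closer to the EEG response than the mismatching one.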
Pages: 799-809 (11 pages)