Behavioral Account of Attended Stream Enhances Neural Tracking

Cited by: 3
Authors
Huet, Moira-Phoebe [1 ,2 ]
Micheyl, Christophe [3 ]
Parizet, Etienne [1 ]
Gaudrain, Etienne [2 ,4 ]
Affiliations
[1] Univ Lyon, Inst Natl Sci Appl Lyon, Lab Vibrat Acoust, Villeurbanne, France
[2] INSERM, CNRS, Auditory Cognit & Psychoacoust Team, Lyon Neurosci Res Ctr,UMR 5292,U1028, Lyon, France
[3] Starkey France Sarl, Creteil, France
[4] Univ Groningen, Univ Med Ctr Groningen, Dept Otorhinolaryngol, Groningen, Netherlands
Keywords
neural tracking; attentional switches; temporal response function (TRF); speech-on-speech; vocal cues; VOCAL-TRACT LENGTH; AUDITORY ATTENTION; SPEECH; SPEAKER; NOISE; MEG; INTELLIGIBILITY; ENVIRONMENT; PERCEPTION; EXTRACTION
DOI
10.3389/fnins.2021.674112
Chinese Library Classification (CLC)
Q189 [Neuroscience]
Discipline code
071006
Abstract
During the past decade, several studies have identified electroencephalographic (EEG) correlates of selective auditory attention to speech. In these studies, listeners are typically instructed to focus on one of two concurrent speech streams (the "target") while ignoring the other (the "masker"). EEG signals are recorded while participants perform this task, and subsequently analyzed to recover the attended stream. An assumption often made in these studies is that the participant's attention remains focused on the target throughout the test. To check this assumption, and to assess when a participant's attention in a concurrent speech listening task was directed toward the target, the masker, or neither, we designed a behavioral listen-then-recall task (the Long-SWoRD test). After listening to two simultaneous short stories, participants had to identify, on a computer screen, keywords from the target story randomly interspersed among words from the masker story and words from neither story. To modulate task difficulty, and hence the likelihood of attentional switches, masker stories were originally uttered by the same talker as the target stories. The masker voice parameters were then manipulated to parametrically control the similarity of the two streams, from clearly dissimilar to almost identical. While participants listened to the stories, EEG signals were measured and subsequently analyzed using a temporal response function (TRF) model to reconstruct the speech stimuli. Responses in the behavioral recall task were used to infer, retrospectively, when attention was directed toward the target, the masker, or neither.
During the model-training phase, the results of these behavioral-data-driven inferences were used as inputs to the model, in addition to the EEG signals, to determine whether this additional information would improve stimulus-reconstruction accuracy relative to models trained under the assumption that the listener's attention was unwaveringly focused on the target. Results from 21 participants show that information regarding the actual, as opposed to assumed, attentional focus can be used advantageously during model training to enhance subsequent (test-phase) accuracy of auditory stimulus reconstruction from EEG signals. This is especially the case in challenging listening situations, where the participants' attention is less likely to remain focused entirely on the target talker. In situations where the two competing voices are clearly distinct and easily separated perceptually, the assumption that listeners are able to stay focused on the target is reasonable. The behavioral recall protocol introduced here provides experimenters with a means to behaviorally track fluctuations in auditory selective attention, including in combined behavioral/neurophysiological studies.
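The stimulus-reconstruction approach described in the abstract (a backward TRF, commonly fitted by ridge regression from time-lagged EEG channels to the attended speech envelope) can be sketched as below. This is an illustrative minimal sketch only, not the authors' implementation; all function names, the lag range, the regularization value, and the toy data are assumptions introduced here for demonstration.

```python
import numpy as np

def lag_matrix(eeg, lags):
    """Stack time-lagged copies of each EEG channel into one design matrix.

    eeg: (n_times, n_channels) array; lags: non-negative sample offsets,
    so the envelope at time t is predicted from EEG at t, t+1, ...
    """
    n_t, n_ch = eeg.shape
    X = np.zeros((n_t, n_ch * len(lags)))
    for j, lag in enumerate(lags):
        shifted = np.roll(eeg, -lag, axis=0)
        if lag > 0:
            shifted[-lag:] = 0  # zero the samples that wrapped around
        X[:, j * n_ch:(j + 1) * n_ch] = shifted
    return X

def train_decoder(eeg, envelope, lags, lam=1.0):
    """Backward TRF: ridge regression mapping lagged EEG to the envelope."""
    X = lag_matrix(eeg, lags)
    XtX = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ envelope)

def reconstruct(eeg, weights, lags):
    """Apply a trained decoder to (possibly new) EEG data."""
    return lag_matrix(eeg, lags) @ weights

# Toy demonstration: two synthetic "EEG" channels that encode a random
# envelope at neural delays of 3 and 5 samples, plus a little noise.
rng = np.random.default_rng(0)
env = rng.standard_normal(2000)
eeg = np.column_stack([np.roll(env, 3), np.roll(env, 5)])
eeg += 0.1 * rng.standard_normal(eeg.shape)

lags = range(0, 8)
w = train_decoder(eeg, env, lags)
rec = reconstruct(eeg, w, lags)
r = np.corrcoef(rec, env)[0, 1]  # reconstruction accuracy (Pearson r)
```

In attention-decoding studies of this kind, the decoder is trained on one portion of the data and then evaluated by correlating its reconstruction with the target and masker envelopes on held-out segments; the behavioral inferences described above change which envelope is used as the training regression target at each moment.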
Pages: 13