Relationship between contributions of temporal amplitude envelope of speech and modulation transfer function in room acoustics to perception of noise-vocoded speech

Cited by: 13
Authors
Unoki, Masashi [1]
Zhu, Zhi [1]
Affiliations
[1] Japan Adv Inst Sci & Technol, Sch Informat Sci, 1-1 Asahidai, Nomi 9231292, Japan
Keywords
Temporal amplitude envelope; Modulation transfer function; Noise-vocoded speech; Vocal-emotion recognition; Speech intelligibility; Temporal modulation-spectral feature; VOCAL-EMOTION; SPEAKER INDIVIDUALITY; LISTENING DIFFICULTY; OBJECTIVE MEASURES; SPECTRAL FEATURES; RECOGNITION; FREQUENCY; INTELLIGIBILITY; REVERBERANT; INFORMATION;
DOI
10.1250/ast.41.233
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Speech signals can be represented as a sum of amplitude-modulated frequency bands. Each band can in turn be regarded as a temporal amplitude envelope (TAE) imposed on a temporal fine structure. Our previous studies using noise-vocoded speech (NVS) showed that the TAE of speech plays an important role in the perception of linguistic information (speech intelligibility) as well as non-linguistic information (e.g., vocal-emotion recognition). Those studies found that an upper limit of modulation frequency in the TAE of 4 to 8 Hz is important for speech intelligibility, whereas an upper limit of 8 to 16 Hz is important for vocal-emotion recognition. However, speech intelligibility generally degrades dramatically under reverberation. The modulation transfer function (MTF) describes how an enclosure transforms an input TAE into an output TAE, and relates that transformation to the characteristics of the enclosure under reverberant conditions. The concept was introduced in room acoustics as a measure for assessing the effect of an enclosure on speech intelligibility. In this study, we conducted two experiments, a word intelligibility test and a vocal-emotion recognition test, with NVS under reverberant conditions to investigate the relationship between the contributions of the TAE of speech and the MTF of reverberation to the modulation perception of NVS. We also show that a straightforward scheme, i.e., one relating the static features (peak/slope) of the modulation spectrum (MS) of speech to the MTF of reverberation, cannot consistently account for the perception of both linguistic and non-linguistic information observed in these perceptual data. We then developed a scheme in which the relationship between the temporal MS features and the MTF of reverberation consistently accounts for these perceptual data of NVS.
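As a rough illustration of why reverberation affects the modulation ranges named in the abstract differently, the sketch below evaluates the classical room-acoustics MTF for an idealized exponentially decaying reverberation without background noise, m(f_m) = 1 / sqrt(1 + (2π f_m T60 / 13.8)^2). This formula is the standard textbook form and is an assumption here: the abstract does not state which MTF model or reverberation times the study used, and the function name `reverb_mtf` and the 1.0 s reverberation time are hypothetical.

```python
import math

def reverb_mtf(fm_hz: float, t60_s: float) -> float:
    """Modulation index retained at modulation frequency fm_hz (Hz).

    Assumes an idealized exponential reverberant decay with
    reverberation time t60_s (seconds) and no background noise.
    This is the classical textbook form, not necessarily the model
    used in the paper.
    """
    return 1.0 / math.sqrt(1.0 + (2.0 * math.pi * fm_hz * t60_s / 13.8) ** 2)

# Band edges mentioned in the abstract: 4, 8, and 16 Hz.
# The 1.0 s reverberation time is an arbitrary illustrative value.
for fm in (4.0, 8.0, 16.0):
    print(f"fm = {fm:4.1f} Hz  ->  m = {reverb_mtf(fm, 1.0):.3f}")
```

Under this model the retained modulation depth falls monotonically with modulation frequency, so the 8 to 16 Hz range is attenuated more strongly than the 4 to 8 Hz range for any nonzero reverberation time.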
Pages: 233-244
Number of pages: 12