Speech intelligibility improvement in car noise environment by voice transformation

Cited by: 16

Authors:

Nathwani, Karan [1]

Richard, Gael [1]

David, Bertrand [1]

Prablanc, Pierre [2]

Roussarie, Vincent [2]

Affiliations:

[1] Univ Paris Saclay, Telecom ParisTech, LTCI, F-75013 Paris, France

[2] PSA Peugeot Citroen, Chemin Gisy, F-78943 Velizy Villacoublay, France

Source:

Speech Communication | 2017 / Vol. 91

Keywords:

Speech intelligibility; Car noise environment; Hearing in noise test; Voice transformation; Lombard speech; FUNDAMENTAL-FREQUENCY; ENHANCEMENT; CLEAR; HEARING

DOI:

10.1016/j.specom.2017.04.007

Chinese Library Classification (CLC):

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

The typical application targeted by this work is the intelligibility improvement of speech messages rendered in a car noise environment (radio, message alerts, ...). The main idea is to transform the original speech into "Lombard" speech or, more precisely, to simulate some of the strategies humans follow to make their speech clearer when they are surrounded by noise. Three main effects are considered: non-uniform time-scale modification, formant shifting, and a combination of these modifications along with energy redistribution between speech regions. All effects are studied with specific transformations for voiced and unvoiced segments. The proposed modifications are then evaluated by means of subjective and objective tests. The results of these tests, conducted with normal-hearing and hearing-impaired listeners, demonstrate the potential of the selected transformations for voice intelligibility improvement. (C) 2017 Elsevier B.V. All rights reserved.

Citation

Pages: 17-27

Page count: 11

References: 51 in total

