Speech intelligibility improvement in car noise environment by voice transformation

Cited by: 16
Authors
Nathwani, Karan [1]
Richard, Gael [1]
David, Bertrand [1]
Prablanc, Pierre [2]
Roussarie, Vincent [2]
Affiliations
[1] Univ Paris Saclay, Telecom ParisTech, LTCI, F-75013 Paris, France
[2] PSA Peugeot Citroen, Chemin Gisy, F-78943 Velizy Villacoublay, France
Keywords
Speech intelligibility; Car noise environment; Hearing in noise test; Voice transformation; Lombard speech; FUNDAMENTAL-FREQUENCY; ENHANCEMENT; CLEAR; HEARING
DOI
10.1016/j.specom.2017.04.007
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
The typical application targeted by this work is improving the intelligibility of speech messages rendered in a car noise environment (radio, message alerts, etc.). The main idea is to transform the original speech into "Lombard" speech or, more precisely, to simulate some of the strategies humans follow to make their speech clearer when surrounded by noise. Three main effects are considered: non-uniform time-scale modification, formant shifting, and a combination of these modifications with energy redistribution between speech regions. All effects are studied with specific transformations for voiced and unvoiced segments. The proposed modifications are then evaluated by means of subjective and objective tests. The results of these tests, conducted with normal-hearing and hearing-impaired listeners, demonstrate the potential of the selected transformations for improving voice intelligibility. © 2017 Elsevier B.V. All rights reserved.
Pages: 17-27
Number of pages: 11