Directional dependency of cepstrum on vocal tract length

Cited by: 2
Authors
Saito, Daisuke [1]
Matsuura, Ryo [1]
Asakawa, Satoshi [1]
Minematsu, Nobuaki [1]
Hirose, Keikichi [2]
Affiliations
[1] Univ Tokyo, Grad Sch Frontier Sci, Tokyo 1138654, Japan
[2] Univ Tokyo, Grad Sch Informat Sci & Technol, Tokyo 1138654, Japan
Source
2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12 | 2008
Keywords
frequency warping; cepstrum; rotation; matrix; vocal tract length
DOI
10.1109/ICASSP.2008.4518652
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
In this paper, we prove that the direction of cepstrum vectors strongly depends on vocal tract length and that this dependency is represented as a rotation in the n-dimensional cepstrum space. In speech recognition studies, vocal tract length normalization (VTLN) techniques are widely used to cancel age and gender differences. In VTLN, frequency warping is often carried out, and it can be implemented as a linear transformation in cepstrum space: $\hat{c} = Ac$. However, the geometric properties of this transformation matrix A have not been well discussed. In this study, its properties are made clear using n-dimensional geometry, and it is shown that the matrix rotates any cepstrum vector similarly and apparently. Experimental results using resynthesized speech demonstrate that cepstrum vectors extracted from a speaker 180 cm in height and those from another speaker 120 cm in height are reasonably orthogonal. This result makes clear one of the reasons why children's speech is very difficult for conventional speech recognizers to deal with adequately.
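The linear map described in the abstract can be explored numerically. The sketch below is illustrative only and is not taken from the paper: it assumes a first-order all-pass (bilinear) warping function with a hypothetical parameter `alpha`, assumes the convention log|S(ω)| = c_0 + 2 Σ c_n cos(nω) for the cepstrum, truncates the cepstrum to a finite order, and builds the matrix by numerical integration. The function names `warp_allpass`, `warping_matrix`, and `angle_deg` are made up for this example.

```python
import numpy as np

def warp_allpass(omega, alpha):
    """Assumed warping function: phase response of a first-order all-pass filter."""
    return omega + 2.0 * np.arctan2(alpha * np.sin(omega),
                                    1.0 - alpha * np.cos(omega))

def warping_matrix(alpha, order, n_grid=4096):
    """Approximate A (coefficients 1..order) such that c_hat = A @ c.

    A[m, n] = (2/pi) * integral_0^pi cos(m*w) * cos(n*theta(w)) dw,
    evaluated with the midpoint rule on n_grid points.
    """
    omega = (np.arange(n_grid) + 0.5) * np.pi / n_grid   # midpoint grid on (0, pi)
    theta = warp_allpass(omega, alpha)                   # warped frequency axis
    idx = np.arange(1, order + 1)                        # skip c_0 (overall gain term)
    cos_out = np.cos(np.outer(idx, omega))               # cos(m * omega)
    cos_in = np.cos(np.outer(idx, theta))                # cos(n * theta(omega))
    return (2.0 / n_grid) * cos_out @ cos_in.T

def angle_deg(u, v):
    """Angle in degrees between two vectors."""
    cosine = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    c = rng.standard_normal(12)          # arbitrary stand-in for a cepstrum vector
    for alpha in (0.0, 0.1, 0.2, 0.4):   # hypothetical warping strengths
        A = warping_matrix(alpha, order=12)
        print(f"alpha={alpha:.1f}  angle(c, Ac) = {angle_deg(c, A @ c):.1f} deg")
```

With `alpha = 0` the matrix reduces to the identity and the angle is zero; larger `|alpha|` values can be used to observe how the direction of `A @ c` moves away from `c`. The truncated matrix built here is not claimed to be an exact rotation, and the mapping between `alpha` and speaker height is outside the scope of this sketch.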
Pages: 4485 / +
Number of pages: 2