Electrolaryngeal Speech Enhancement Based on Statistical Voice Conversion

Cited by: 0
Authors
Nakamura, Keigo [1]
Toda, Tomoki [1]
Saruwatari, Hiroshi [1]
Shikano, Kiyohiro [1]
Affiliations
[1] Nara Inst Sci & Technol, Grad Sch Informat Sci, Nara, Japan
Source
INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5 | 2009
Keywords
Electrolarynx; Laryngectomee; Voice conversion; Speaking-aid;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a speaking-aid system for laryngectomees using GMM-based voice conversion that converts electrolaryngeal speech (EL speech) to normal speech. Because valid F0 information cannot be obtained from EL speech, we have so far converted EL speech to whispered speech. This paper instead converts EL speech to normal speech using F0 contours estimated from the spectral information of the EL speech. We experimentally evaluate these two types of output speech from our speaking-aid system from several points of view. The experimental results demonstrate that the converted normal speech is preferred to the converted whisper.
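As a rough illustration of what "GMM-based voice conversion" means, the sketch below applies the classic frame-wise minimum mean-square error mapping from a joint source/target GMM. It is not taken from the paper and may differ from the authors' actual method. The scalar features, the function names, and the two-component parameter values are all hypothetical, chosen only to keep the example small.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def convert_frame(x, components):
    """Map one source feature x to a target feature with a joint GMM.

    Each component holds the mixture weight, the source/target means,
    the source variance, and the target-source covariance.
    """
    # Posterior probability of each mixture component given the source frame.
    likelihoods = [
        c["weight"] * gaussian_pdf(x, c["mu_x"], c["var_x"]) for c in components
    ]
    total = sum(likelihoods)
    posteriors = [lik / total for lik in likelihoods]

    # Posterior-weighted sum of the per-component linear regressions.
    return sum(
        p * (c["mu_y"] + c["cov_yx"] / c["var_x"] * (x - c["mu_x"]))
        for p, c in zip(posteriors, components)
    )


# Hypothetical hand-set parameters; a real system would train them on
# time-aligned source (EL speech) and target (normal speech) features.
COMPONENTS = [
    {"weight": 0.5, "mu_x": -1.0, "mu_y": 2.0, "var_x": 1.0, "cov_yx": 0.5},
    {"weight": 0.5, "mu_x": 1.0, "mu_y": 4.0, "var_x": 1.0, "cov_yx": 0.5},
]

if __name__ == "__main__":
    for frame in (-2.0, 0.0, 2.0):
        print(frame, convert_frame(frame, COMPONENTS))
```

In practice the features would be multi-dimensional spectral vectors, so the per-component variance and covariance become matrices. The same posterior-weighted regression could in principle be used to predict an F0 value from spectral features, which is the kind of estimate the abstract refers to.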
Pages: 1443-1446
Number of pages: 4