Measuring the gap between HMM-based ASR and TTS

Cited by: 0

Authors:

Dines, John [1]

Yamagishi, Junichi [2]

King, Simon [2]

Affiliations:

[1] Idiap Res Inst, CH-1920 Martigny, Switzerland

[2] Univ Edinburgh, CSTR, Edinburgh EH8 9AB, Midlothian, Scotland

Source:

INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5 | 2009

Funding:

Engineering and Physical Sciences Research Council (EPSRC), UK;

Keywords:

speech synthesis; speech recognition; unified models;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The EMIME European project is conducting research in the development of technologies for mobile, personalised speech-to-speech translation systems. The hidden Markov model is being used as the underlying technology in both the automatic speech recognition (ASR) and text-to-speech synthesis (TTS) components; thus, the investigation of unified statistical modelling approaches has become an implicit goal of our research. As one of the first steps towards this goal, we have been investigating commonalities and differences between HMM-based ASR and TTS. In this paper we present results and analysis of a series of experiments conducted on English ASR and TTS systems, measuring their performance with respect to phone set and lexicon, acoustic feature type and dimensionality, and HMM topology. Our results show that, although the fundamental statistical model may be essentially the same, optimal ASR and TTS performance often demands diametrically opposed system designs. This represents a major challenge to be addressed in the investigation of such unified modelling approaches.

Citation

Pages: 1411 / +

Number of pages: 2

50 items in total

[31] Parameterization of Vocal Fry in HMM-Based Speech Synthesis
Silen, Hanna
Helander, Elina
Nurminen, Jani
Gabbouj, Moncef
INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1735 - +
[32] Semi-Supervision in ASR: Sequential MixMatch and Factorized TTS-Based Augmentation
Chen, Zhehuai
Rosenberg, Andrew
Zhang, Yu
Zen, Heiga
Ghodsi, Mohammadreza
Huang, Yinghui
Emond, Jesse
Wang, Gary
Ramabhadran, Bhuvana
Moreno, Pedro J.
INTERSPEECH 2021, 2021, : 736 - 740
[33] HMM-based Tibetan Lhasa Speech Synthesis System
Wu Zhiqiang
Yu Hongzhi
Li Guanyu
Wan Shuhui
2013 3RD INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT), 2013, : 92 - 95
[34] Continuous Control of the Degree of Articulation in HMM-based Speech Synthesis
Picart, Benjamin
Drugman, Thomas
Dutoit, Thierry
12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 1808 - 1811
[35] QUALITY CONTROL OF AUTOMATIC LABELLING USING HMM-BASED SYNTHESIS
Pammi, Sathish
Charfuelan, Marcela
Schroeder, Marc
2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS, 2009, : 4277 - +
[36] FPGA Architecture of HMM-based Decoder Module in Speech Recognizer
Trang Hoang
Viet Vo Quoc
Truong Nguyen Ly Thien
2012 INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND INFORMATION SCIENCES (ICCAIS), 2012, : 354 - 358
[37] Formant-controlled HMM-based Speech Synthesis
Lei, Ming
Yamagishi, Junichi
Richmond, Korin
Ling, Zhen-Hua
King, Simon
Dai, Li-Rong
12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2788 - +
[38] Automatic Variation of the Degree of Articulation in New HMM-Based Voices
Picart, Benjamin
Drugman, Thomas
Dutoit, Thierry
IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2014, 8 (02) : 307 - 322
[39] A Covariance-Tying Technique for HMM-Based Speech Synthesis
Oura, Keiichiro
Zen, Heiga
Nankaku, Yoshihiko
Lee, Akinobu
Tokuda, Keiichi
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (03) : 595 - 601
[40] Data Selection and Adaptation for Naturalness in HMM-based Speech Synthesis
Cooper, Erica
Chang, Alison
Levitan, Yocheved
Hirschberg, Julia
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 357 - +
