Embedded Unit Selection Text-to-Speech Synthesis for Mobile Devices

Cited by: 15
Authors
Karabetsos, Sotiris [1]
Tsiakoulis, Pirros [1]
Chalamandaris, Aimilios [1]
Raptis, Spyros [1]
Affiliations
[1] Inst Language & Speech Proc RC Athena, Dept Voice & Sound Technol, GR-15125 Athens, Greece
Keywords
Embedded Speech Synthesis; Unit Selection; Text-to-Speech; Mobile Devices; Mobile Phones;
DOI
10.1109/TCE.2009.5174430
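Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809
Abstract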
Nowadays, unit selection based text-to-speech technology is the mainstream approach for near-natural speech synthesis systems. However, this is achieved at the expense of increased requirements in terms of computational resources. This work describes design and implementation approaches for the efficient integration of this technology into computational environments with limited resources, such as mobile devices, with no considerable degradation in speech quality. In particular, this paper mainly addresses the issues of database reduction, acoustic inventory compression, and runtime computational load minimization. Both objective and subjective assessments confirm the effectiveness of these approaches in constructing a general-purpose embedded unit selection TTS system and in reducing the computational requirements while maintaining high speech quality.
Pages: 613-621
Number of pages: 9