Embedded Unit Selection Text-to-Speech Synthesis for Mobile Devices

Cited by: 15
Authors
Karabetsos, Sotiris [1]
Tsiakoulis, Pirros [1]
Chalamandaris, Aimilios [1]
Raptis, Spyros [1]
Affiliation
[1] Inst Language & Speech Proc RC Athena, Dept Voice & Sound Technol, GR-15125 Athens, Greece
Keywords
Embedded Speech Synthesis; Unit Selection; Text-to-Speech; Mobile Devices; Mobile Phones
DOI
10.1109/TCE.2009.5174430
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809
Abstract
Unit-selection-based text-to-speech technology is currently the mainstream approach for near-natural speech synthesis systems. However, this naturalness comes at the cost of increased computational requirements. This work describes design and implementation approaches for efficiently integrating this technology into computing environments with limited resources, such as mobile devices, without considerable degradation of speech quality. In particular, the paper addresses database reduction, acoustic inventory compression, and runtime computational load minimization. Both objective and subjective assessments confirm the effectiveness of these approaches for constructing a general-purpose embedded unit selection TTS system and reducing computational requirements while maintaining high speech quality.
Pages: 613-621
Page count: 9