Embedded Unit Selection Text-to-Speech Synthesis for Mobile Devices

Cited by: 15

Authors:

Karabetsos, Sotiris [1]

Tsiakoulis, Pirros [1]

Chalamandaris, Aimilios [1]

Raptis, Spyros [1]

Affiliations:

[1] Inst Language & Speech Proc RC Athena, Dept Voice & Sound Technol, GR-15125 Athens, Greece

Source:

IEEE TRANSACTIONS ON CONSUMER ELECTRONICS | 2009 / Vol. 55 / No. 02

Keywords:

Embedded Speech Synthesis; Unit Selection; Text-to-Speech; Mobile Devices; Mobile Phones

DOI:

10.1109/TCE.2009.5174430

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]

Discipline classification code:

0808; 0809

Abstract:

Nowadays, unit selection based text-to-speech technology is the mainstream approach for near natural speech synthesis systems. However, this is achieved at the expense of raised requirements in terms of computational resources. This work describes design and implementation approaches for the efficient integration of this technology in computational environments with limited resources, such as mobile devices, with no considerable speech quality degradation. In particular, the issues of database reduction, acoustic inventory compression and runtime computational load minimization are mainly addressed in this paper. Both objective and subjective assessments confirm the effectiveness of these approaches in terms of constructing a general purpose embedded unit selection TTS system and reducing the computational requirements while maintaining high speech quality.

Pages: 613-621

Page count: 9

Related items (50 in total):

[41] Text aware Emotional Text-to-speech with BERT
Mukherjee, Arijit
Bansal, Shubham
Satpal, Sandeepkumar
Mehta, Rupesh
INTERSPEECH 2022, 2022, : 4601 - 4605
[42] [Invited] Generative Model-Based Text-to-Speech Synthesis
Zen, Heiga
2018 IEEE 7TH GLOBAL CONFERENCE ON CONSUMER ELECTRONICS (GCCE 2018), 2018, : 327 - 328
[43] Development of an automatic phonetization system for Arabic text-to-speech synthesis
Imedjdouben, Faycal
Houacine, Amrane
INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2014, 17 (04) : 417 - 426
[44] Text-To-Speech Intelligibility across Speech Rates
Syrdal, Ann K.
Bunnell, H. Timothy
Hertz, Susan R.
Mishra, Taniya
Spiegel, Murray
Bickley, Corine
Rekart, Deborah
Makashay, Matthew J.
13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 622 - 625
[45] Challenges for Edge-AI Implementations of Text-To-Speech Synthesis
Bigioi, Dan
Corcoran, Peter
2021 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE), 2021,
[46] Cross-Language Phonemisation In German Text-To-Speech Synthesis
Steigner, Jochen
Schroeder, Marc
INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 833 - +
[47] Text-to-Speech With Lip Synchronization Based on Speech-Assisted Text-to-Video Alignment and Masked Unit Prediction
Ahn, Youngdo
Chae, Jongwook
Shin, Jong Won
IEEE SIGNAL PROCESSING LETTERS, 2025, 32 : 961 - 965
[48] Modeling the Acoustic Correlates of Expressive Elements in Text Genres for Expressive Text-to-Speech Synthesis
Yang, Hongwu
Meng, Helen M.
Cai, Lianhong
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1806 - 1809
[49] Design and Implementation of a Diacritic Arabic Text-To-Speech System
Amrouche, Aissa
Falek, Leila
Teffahi, Hocine
INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2017, 14 (04) : 488 - 494
[50] Expressive Text-to-Speech Synthesis using Text Chat Dataset with Speaking Style Information
Homma Y.
Kanagawa H.
Kobayashi N.
Ijima Y.
Saito K.
Transactions of the Japanese Society for Artificial Intelligence, 2023, 38 (03)