A survey on speech synthesis techniques in Indian languages

Cited by: 0
Authors
Soumya Priyadarsini Panda
Ajit Kumar Nayak
Satyananda Champati Rai
Affiliations
[1] Silicon Institute of Technology, Department of CSE
[2] Siksha ‘O’ Anusandhan University, Department of CS and IT
[3] Silicon Institute of Technology, Department of IT
Source
Multimedia Systems | 2020 / Vol. 26
Keywords
Text to speech system; Speech synthesis; Indian languages; Concatenative synthesis; Formant synthesis; Articulatory synthesis; Syllable-based synthesis; HMM-based synthesis; Statistical parametric synthesis; Polyglot synthesis; Multilingual synthesis; Waveform concatenation; Deep learning
DOI
Not available
Abstract
Text-to-speech (TTS) technology has made significant progress during the past decade and remains an active area of research and development for human–computer interaction systems. Although a number of speech synthesis models are available for different languages, tailored to domain requirements and motivated by many applications, no single source of information on current trends in Indian language speech synthesis has been available to date, which makes it difficult for beginners to initiate research on TTS systems for low-resourced languages. This paper reviews the contributions made by different researchers in the field of Indian language speech synthesis, along with a study of Indian language characteristics and the associated challenges in designing TTS systems. A set of available applications and tools resulting from projects undertaken by different organizations, along with a set of possible future developments, is also discussed to provide a single reference to an important strand of research in speech synthesis, which may benefit anyone interested in initiating research in this area.
Pages: 453-478
Page count: 25