Flexible speech translation systems

Cited by: 8
Authors
Schultz, T [1]
Black, AW [1]
Vogel, S [1]
Woszczyna, M [1]
Affiliation
[1] Carnegie Mellon Univ, Interact Syst Lab, Pittsburgh, PA 15213 USA
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2006, Vol. 14, No. 2
Keywords
multilinguality; portability; speech translation; system deployment
DOI
10.1109/TSA.2005.860768
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Speech translation research has made significant progress over the years, with many high-visibility efforts showing that translation of spontaneously spoken speech from and to diverse languages is possible and applicable in a variety of domains. As languages and domains continue to expand, practical concerns such as portability and reconfigurability of speech come into play: system maintenance becomes a key issue, and data is never sufficient to cover the changing domains over varying languages. In this paper, we discuss strategies to overcome the limits of today's speech translation systems. In the first part, we describe our layered system architecture, which allows for easy component integration, resource sharing across components, comparison of alternative approaches, and migration toward hybrid desktop/PDA or stand-alone PDA systems. In the second part, we show how flexibility and reconfigurability are implemented by relying more radically on learning approaches, and we use our English-Thai two-way speech translation system as a concrete example.
Pages: 403-411
Number of pages: 9