Flexible speech translation systems

Cited by: 8
Authors
Schultz, T [1]
Black, AW [1]
Vogel, S [1]
Woszczyna, M [1]
Affiliation
[1] Carnegie Mellon Univ, Interact Syst Lab, Pittsburgh, PA 15213 USA
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2006, Vol. 14, No. 2
Keywords
multilinguality; portability; speech translation; system deployment
DOI
10.1109/TSA.2005.860768
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Speech translation research has made significant progress over the years with many high-visibility efforts showing that translation of spontaneously spoken speech from and to diverse languages is possible and applicable in a variety of domains. As language and domains continue to expand, practical concerns such as portability and reconfigurability of speech come into play: system maintenance becomes a key issue and data is never sufficient to cover the changing domains over varying languages. In this paper, we discuss strategies to overcome the limits of today's speech translation systems. In the first part, we describe our layered system architecture that allows for easy component integration, resource sharing across components, comparison of alternative approaches, and the migration toward hybrid desktop/PDA or stand-alone PDA systems. In the second part, we show how flexibility and reconfigurability is implemented by more radically relying on learning approaches and use our English-Thai two-way speech translation system as a concrete example.
Pages: 403-411
Page count: 9