Flexible speech translation systems

Cited by: 8

Authors:

Schultz, T [1]

Black, AW [1]

Vogel, S [1]

Woszczyna, M [1]

Affiliations:

[1] Carnegie Mellon Univ, Interact Syst Lab, Pittsburgh, PA 15213 USA

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2006 / Vol. 14 / Issue 02

Keywords:

multilinguality; portability; speech translation; system deployment

DOI:

10.1109/TSA.2005.860768

Chinese Library Classification:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Speech translation research has made significant progress over the years, with many high-visibility efforts showing that translation of spontaneous speech from and to diverse languages is possible and applicable in a variety of domains. As languages and domains continue to expand, practical concerns such as portability and reconfigurability of speech come into play: system maintenance becomes a key issue, and data is never sufficient to cover the changing domains over varying languages. In this paper, we discuss strategies to overcome the limits of today's speech translation systems. In the first part, we describe our layered system architecture, which allows for easy component integration, resource sharing across components, comparison of alternative approaches, and migration toward hybrid desktop/PDA or stand-alone PDA systems. In the second part, we show how flexibility and reconfigurability are implemented by relying more radically on learning approaches, using our English-Thai two-way speech translation system as a concrete example.

Pages: 403-411

Page count: 9
