MuST-C: A multilingual corpus for end-to-end speech translation

Cited by: 70

Authors:

Cattoni, Roldano [1]

Di Gangi, Mattia Antonino [1,2]

Bentivogli, Luisa [1]

Negri, Matteo [1]

Turchi, Marco [1]

Affiliations:

[1] Fondazione Bruno Kessler, Via Sommarive 18, I-38123 Povo, TN, Italy

[2] University of Trento, Trento, Italy

Source:

COMPUTER SPEECH AND LANGUAGE | 2021 / Vol. 66

Keywords:

Spoken language translation; Multilingual corpus

DOI:

10.1016/j.csl.2020.101155

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

End-to-end spoken language translation (SLT) has recently gained popularity thanks to advances in sequence-to-sequence learning in its two parent tasks: automatic speech recognition (ASR) and machine translation (MT). However, research in the field must contend with the scarcity of publicly available corpora for training data-hungry neural networks. Indeed, while traditional cascade solutions can build on sizable ASR and MT training data for a variety of languages, the available SLT corpora suitable for end-to-end training are few, typically small, and of limited language coverage. We help fill this gap by presenting MuST-C, a large and freely available Multilingual Speech Translation Corpus built from English TED Talks. Its unique features include: i) language coverage and diversity (from English into 14 languages from different families), ii) size (at least 237 hours of transcribed recordings per language, 430 on average), iii) variety of topics and speakers, and iv) data quality. Besides describing the corpus creation methodology and discussing the outcomes of empirical and manual quality evaluations, we present baseline results computed with strong systems on each language direction covered by MuST-C. © 2020 Elsevier Ltd. All rights reserved.

Pages: 14

