MuST-C: A multilingual corpus for end-to-end speech translation

Cited by: 70
Authors
Cattoni, Roldano [1]
Di Gangi, Mattia Antonino [1,2]
Bentivogli, Luisa [1]
Negri, Matteo [1]
Turchi, Marco [1]
Affiliations
[1] Fdn Bruno Kessler, Via Sommar 18, I-38123 Povo, TN, Italy
[2] Univ Trento, Trento, Italy
Keywords
Spoken language translation; Multilingual corpus
DOI
10.1016/j.csl.2020.101155
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
End-to-end spoken language translation (SLT) has recently gained popularity thanks to advances in sequence-to-sequence learning in its two parent tasks: automatic speech recognition (ASR) and machine translation (MT). However, research in the field has to contend with the scarcity of publicly available corpora for training data-hungry neural networks. Indeed, while traditional cascade solutions can build on sizable ASR and MT training data for a variety of languages, the available SLT corpora suitable for end-to-end training are few, typically small, and of limited language coverage. We help fill this gap by presenting MuST-C, a large and freely available Multilingual Speech Translation Corpus built from English TED Talks. Its unique features include: i) language coverage and diversity (from English into 14 languages from different families), ii) size (at least 237 hours of transcribed recordings per language, 430 on average), iii) variety of topics and speakers, and iv) data quality. Besides describing the corpus creation methodology and discussing the outcomes of empirical and manual quality evaluations, we present baseline results computed with strong systems on each language direction covered by MuST-C. (C) 2020 Elsevier Ltd. All rights reserved.
Pages: 14