MuST-C: A multilingual corpus for end-to-end speech translation

Cited by: 70
Authors
Cattoni, Roldano [1]
Di Gangi, Mattia Antonino [1, 2]
Bentivogli, Luisa [1]
Negri, Matteo [1]
Turchi, Marco [1]
Affiliations
[1] Fdn Bruno Kessler, Via Sommar 18, I-38123 Povo, TN, Italy
[2] Univ Trento, Trento, Italy
Keywords
Spoken language translation; Multilingual corpus
DOI
10.1016/j.csl.2020.101155
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
End-to-end spoken language translation (SLT) has recently gained popularity thanks to advances in sequence-to-sequence learning in its two parent tasks: automatic speech recognition (ASR) and machine translation (MT). However, research in the field has to contend with the scarcity of publicly available corpora for training data-hungry neural networks. Indeed, while traditional cascade solutions can build on sizable ASR and MT training data for a variety of languages, the available SLT corpora suitable for end-to-end training are few, typically small, and of limited language coverage. We help fill this gap by presenting MuST-C, a large and freely available Multilingual Speech Translation Corpus built from English TED Talks. Its unique features include: i) language coverage and diversity (from English into 14 languages from different families), ii) size (at least 237 hours of transcribed recordings per language, 430 on average), iii) variety of topics and speakers, and iv) data quality. Besides describing the corpus creation methodology and discussing the outcomes of empirical and manual quality evaluations, we present baseline results computed with strong systems on each language direction covered by MuST-C. © 2020 Elsevier Ltd. All rights reserved.
Pages: 14