Learning Adaptive Segmentation Policy for End-to-End Simultaneous Translation

被引:0
|
作者
Zhang, Ruiqing [1 ]
He, Zhongjun [1 ]
Wu, Hua [1 ]
Wang, Haifeng [1 ]
机构
[1] Baidu Inc, 10,Shangdi 10th St, Beijing 100085, Peoples R China
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
End-to-end simultaneous speech-to-text translation aims to directly perform translation from streaming source speech to target text with high translation quality and low latency. A typical simultaneous translation (ST) system consists of a speech translation model and a policy module, which determines when to wait and when to translate. Thus the policy is crucial to balance translation quality and latency. Conventional methods usually adopt fixed policies, e.g. segmenting the source speech with a fixed length and generating translation. However, this method ignores contextual information and suffers from low translation quality. This paper proposes an adaptive segmentation policy for end-toend ST. Inspired by human interpreters, the policy learns to segment the source streaming speech into meaningful units by considering both acoustic features and translation history, maintaining consistency between the segmentation and translation. Experimental results on English-German and Chinese-English show that our method achieves a good accuracy-latency trade-off over recently proposed state-of-the-art methods.
引用
收藏
页码:7862 / 7874
页数:13
相关论文
共 50 条
  • [1] End-to-End Simultaneous Speech Translation with Differentiable Segmentation
    Zhang, Shaolei
    Feng, Yang
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 7659 - 7680
  • [2] Impact of Encoding and Segmentation Strategies on End-to-End Simultaneous Speech Translation
    Nguyen, Ha
    Esteve, Yannick
    Besacier, Laurent
    INTERSPEECH 2021, 2021, : 2371 - 2375
  • [3] Learning Adaptive Segmentation Policy for Simultaneous Translation
    Zhang, Ruiqing
    Zhang, Chuanqiang
    He, Zhongjun
    Wu, Hua
    Wang, Haifeng
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 2280 - 2289
  • [4] SimulSpeech: End-to-End Simultaneous Speech to Text Translation
    Ren, Yi
    Liu, Jinglin
    Tan, Xu
    Zhang, Chen
    Qin, Tao
    Zhao, Zhou
    Liu, Tie-Yan
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 3787 - 3796
  • [5] SimulSLT: End-to-End Simultaneous Sign Language Translation
    Yin, Aoxiong
    Zhao, Zhou
    Liu, Jinglin
    Jin, Weike
    Zhang, Meng
    Zeng, Xingshan
    He, Xiaofei
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 4118 - 4127
  • [6] End-to-End Ultrametric Learning for Hierarchical Segmentation
    Lapertot, Raphael
    Chierchia, Giovanni
    Perret, Benjamin
    DISCRETE GEOMETRY AND MATHEMATICAL MORPHOLOGY, DGMM 2024, 2024, 14605 : 286 - 297
  • [7] SimulMT to SimulST: Adapting Simultaneous Text Translation to End-to-End Simultaneous Speech Translation
    Ma, Xutai
    Pino, Juan
    Koehn, Philipp
    1ST CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 10TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (AACL-IJCNLP 2020), 2020, : 582 - 587
  • [8] Adaptive Feature Selection for End-to-End Speech Translation
    Zhang, Biao
    Titov, Ivan
    Haddow, Barry
    Sennrich, Rico
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 2533 - 2544
  • [9] SHAS: Approaching optimal Segmentation for End-to-End Speech Translation
    Tsiamas, Ioannis
    Gallego, Gerard I.
    Fonollosa, Jose A. R.
    Costa-jussa, Marta R.
    INTERSPEECH 2022, 2022, : 106 - 110
  • [10] MINTZAI: End-to-end Deep Learning for Speech Translation
    Etchegoyhen, Thierry
    Arzelus, Haritz
    Gete, Harritxu
    Alvarez, Aitor
    Hernaez, Inma
    Navas, Eva
    Gonzalez-Docasal, Ander
    Osacar, Jaime
    Benites, Edson
    Ellakuria, Igor
    Calonge, Eusebi
    Martin, Maite
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2020, (65): : 97 - 100