ASSESSING EVALUATION METRICS FOR SPEECH-TO-SPEECH TRANSLATION

Cited by: 4
Authors
Salesky, Elizabeth [1]
Maeder, Julian [2]
Klinger, Severin [2]
Affiliations
[1] Johns Hopkins Univ, Baltimore, MD 21218 USA
[2] Swiss Fed Inst Technol, Zurich, Switzerland
Source
2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU) | 2021
Keywords
evaluation; speech synthesis; speech translation; speech-to-speech; dialects
DOI
10.1109/ASRU51503.2021.9688073
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Speech-to-speech translation combines machine translation with speech synthesis, introducing evaluation challenges not present in either task alone. How to automatically evaluate speech-to-speech translation is an open question which has not previously been explored. Translating to speech rather than to text is often motivated by unwritten languages or languages without standardized orthographies. However, we show that the previously used automatic metric for this task is best equipped for standardized high-resource languages only. In this work, we first evaluate current metrics for speech-to-speech translation, and second assess how translation to dialectal variants rather than to standardized languages impacts various evaluation methods.
Pages: 733-740
Page count: 8