ASSESSING EVALUATION METRICS FOR SPEECH-TO-SPEECH TRANSLATION

Cited by: 4

Authors:

Salesky, Elizabeth [1]

Maeder, Julian [2]

Klinger, Severin [2]

Affiliations:

[1] Johns Hopkins Univ, Baltimore, MD 21218 USA

[2] Swiss Fed Inst Technol, Zurich, Switzerland

Source:

2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU) | 2021

Keywords:

evaluation; speech synthesis; speech translation; speech-to-speech; dialects

DOI:

10.1109/ASRU51503.2021.9688073

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Speech-to-speech translation combines machine translation with speech synthesis, introducing evaluation challenges not present in either task alone. How to automatically evaluate speech-to-speech translation is an open question which has not previously been explored. Translating to speech rather than to text is often motivated by unwritten languages or languages without standardized orthographies. However, we show that the previously used automatic metric for this task is best equipped for standardized high-resource languages only. In this work, we first evaluate current metrics for speech-to-speech translation, and second assess how translation to dialectal variants rather than to standardized languages impacts various evaluation methods.


Pages: 733-740

Number of pages: 8
