Unsupervised phonetic and word level discovery for speech to speech translation for unwritten languages

Cited by: 2

Authors:

Hillis, Steven [1]

Kumar, Anushree Prasanna [1]

Black, Alan W. [1]

Affiliations:

[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA

Source:

INTERSPEECH 2019 | 2019

Keywords:

speech-to-speech; machine translation; segmentation; unit discovery; low-resource; unwritten languages; Wilderness;

DOI:

10.21437/Interspeech.2019-3026

Chinese Library Classification (CLC) codes:

R36 [Pathology]; R76 [Otorhinolaryngology];

Discipline classification codes:

100104 ; 100213 ;

Abstract:

We experiment with unsupervised methods for deriving and clustering symbolic representations of speech, working towards speech-to-speech translation for languages without regular (or any) written representations. We consider five low-resource African languages, and we produce three different segmental representations of text data for comparisons against four different segmental representations derived solely from acoustic data for each language. The text and speech data for each language come from the CMU Wilderness dataset introduced in [1], where speakers read a version of the New Testament in their language. Our goal is to evaluate the translation performance not only of acoustically derived units but also of discovered sequences or "words" made from these units, with the intuition that such representations will encode more meaning than phones alone. We train statistical machine translation models for each representation and evaluate their outputs on the basis of BLEU-1 scores to determine their efficacy. Our experiments produce encouraging results: as we cluster our atomic phonetic representations into more word-like units, the amount of information retained generally approaches that of the actual words themselves.


Pages: 1138-1142

Page count: 5

References (35 total):

[1] [Anonymous], 2017, P ICNLSSP CAS MOR
[2] Baljekar P, 2015, 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, P3194
[3] Bansal S, Kamper H, Livescu K, Lopez A, Goldwater S, 2018, Low-Resource Speech-to-Text Translation, 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, P1298-1302
[4] Bansal Sameer, 2018, ARXIV180901431
[5] Bérard A, 2018, 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), P6224, DOI 10.1109/ICASSP.2018.8461690
[6] Berard Alexandre, 2016, NIPS WORKSH END TO E
[7] Besacier L, Zhou B, Gao Y, 2006, Towards speech translation of non written languages, 2006 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, P222-+
[8] Black A. W., 2019, ICASSP
[9] Black A. W., 2006, 9 INT C SPOK LANG PR
[10] Gage P., 1994, The C Users Journal, V12, P2338, DOI 10.5555/177910.177914
