Recognition and Translation of Code-switching Speech Utterances

Cited by: 0
Authors
Nakayama, Sahoko [1]
Kano, Takatomo [1]
Tjandra, Andros [1]
Sakti, Sakriani [1, 2]
Nakamura, Satoshi [1, 2]
Affiliations
[1] Nara Inst Sci & Technol, Nara, Japan
[2] Adv Intelligence Project AIP, RIKEN, Nara, Japan
Source
2019 22ND CONFERENCE OF THE ORIENTAL COCOSDA INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDISATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (O-COCOSDA) | 2019
Keywords
code-switching; speech recognition; speech and text translation; BERT; multi-task learning;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Code-switching (CS), a hallmark of bilingual communities worldwide, refers to a strategy adopted by bilinguals (or multilinguals) who mix two or more languages in a discourse, often with little change of interlocutor or topic. The units and locations of the switches may vary widely, from single-word switches to whole phrases (beyond the length of loanword units). Such phenomena pose challenges for spoken language technologies such as automatic speech recognition (ASR), since the systems need to handle input in a multilingual setting. Several works have constructed CS ASR systems for many different language pairs. However, the common aim of developing a CS ASR is merely to transcribe CS-speech utterances into CS-text sentences within a single individual. In contrast, in this study, we address the situational context that arises during dialogs between CS and non-CS (monolingual) speakers, and support monolingual speakers who want to understand CS speakers. We construct a system that recognizes and translates code-switching speech into monolingual text. We investigate several approaches, including a cascade of ASR and neural machine translation (NMT), a cascade of ASR and a deep bidirectional language model (BERT), an ASR that directly outputs monolingual transcriptions from CS speech, and multi-task learning. Finally, we evaluate and discuss these four approaches on a Japanese-English CS to English monolingual task.
Pages: 34-39
Page count: 6