CONVERTING WRITTEN LANGUAGE TO SPOKEN LANGUAGE WITH NEURAL MACHINE TRANSLATION FOR LANGUAGE MODELING

Times cited: 0
Authors
Ando, Shintaro [1, 2]
Suzuki, Masayuki [2 ]
Itoh, Nobuyasu [2 ]
Kurata, Gakuto [2 ]
Minematsu, Nobuaki [1 ]
Affiliations
[1] Univ Tokyo, Grad Sch Engn, Tokyo, Japan
[2] IBM Res AI, Tokyo, Japan
Source
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020
Keywords
spontaneous speech; parallel corpus; Transformer; domain adaptation; TEXT
DOI
10.1109/icassp40776.2020.9053226
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
When building a language model (LM) for spontaneous speech, the ideal situation is to have a large amount of spoken, in-domain training data. Having such abundant data, however, is not realistic. We address this problem by generating texts in spoken language from those in written language by using a neural machine translation (NMT) model. We collected faithful transcripts of fully spontaneous speech and corresponding written versions and used them as a parallel corpus to train the NMT model. We used top-k random sampling, which generates a large variety of texts of higher quality than other generation methods for NMT. We show that the NMT model is capable of converting written texts in a certain domain to spoken texts, and that the converted texts are effective for training LMs. Our experimental results show a significant improvement in speech recognition accuracy with the LMs.
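The abstract names top-k random sampling as the generation method but this record gives no implementation details. As a rough illustration only, the generic technique can be sketched as below: at each decoding step, keep the k highest-scoring candidate tokens, renormalize their probabilities, and draw one at random. The function name `top_k_sample` and the toy scores are hypothetical and are not taken from the paper.

```python
import math
import random


def top_k_sample(logits, k, rng=random):
    """Sample one token index from the k highest-scoring entries of `logits`.

    Generic sketch of top-k random sampling; not the authors' code.
    """
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    # Indices of the k largest scores, best first.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax weights over those k entries only (shifted by the max for stability).
    peak = logits[top[0]]
    weights = [math.exp(logits[i] - peak) for i in top]
    # Draw one index in proportion to its renormalized weight.
    return rng.choices(top, weights=weights, k=1)[0]


# Toy example: hypothetical scores for a four-token vocabulary.
toy_logits = [2.0, 0.5, 1.5, -1.0]
sampled = top_k_sample(toy_logits, k=2, rng=random.Random(0))
```

With k = 1 this reduces to greedy decoding, while larger k admits more varied outputs, which is presumably the source of the text variety the abstract mentions.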
Pages: 8124-8128
Number of pages: 5