Speech Corpus of Ainu Folklore and End-to-end Speech Recognition for Ainu Language

被引:0
|
作者
Matsuura, Kohei [1 ]
Ueno, Sei [1 ]
Mimura, Masato [1 ]
Sakai, Shinsuke [1 ]
Kawahara, Tatsuya [1 ]
机构
[1] Kyoto Univ, Grad Sch Informat, Sakyo Ku, Kyoto 6068501, Japan
关键词
Ainu speech corpus; low-resource language; end-to-end speech recognition; JAPANESE;
D O I
暂无
中图分类号
TP39 [计算机的应用];
学科分类号
081203 ; 0835 ;
摘要
Ainu is an unwritten language that has been spoken by Ainu people who are one of the ethnic groups in Japan. It is recognized as critically endangered by UNESCO and archiving and documentation of its language heritage is of paramount importance. Although a considerable amount of voice recordings of Ainu folklore has been produced and accumulated to save their culture, only a quite limited parts of them are transcribed so far. Thus, we started a project of automatic speech recognition (ASR) for the Ainu language in order to contribute to the development of annotated language archives. In this paper, we report speech corpus development and the structure and performance of end-to-end ASR for Ainu. We investigated four modeling units (phone, syllable, word piece, and word) and found that the syllable-based model performed best in terms of both word and phone recognition accuracy, which were about 60% and over 85% respectively in speaker-open condition. Furthermore, word and phone accuracy of 80% and 90% has been achieved in a speaker-closed setting. We also found out that a multilingual ASR training with additional speech corpora of English and Japanese further improves the speaker-open test accuracy.
引用
收藏
页码:2622 / 2628
页数:7
相关论文
共 50 条
  • [41] Materials for the Study of the Ainu Language and Folklore
    Hestermann, P. F.
    ANTHROPOS, 1914, 9 (3-4) : 696 - 697
  • [42] END-TO-END SPEECH RECOGNITION WITH WORD-BASED RNN LANGUAGE MODELS
    Hori, Takaaki
    Cho, Jaejin
    Watanabe, Shinji
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 389 - 396
  • [43] ADVERSARIAL TRAINING OF END-TO-END SPEECH RECOGNITION USING A CRITICIZING LANGUAGE MODEL
    Liu, Alexander H.
    Lee, Hung-yi
    Lee, Lin-shan
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6176 - 6180
  • [44] AN END-TO-END LANGUAGE-TRACKING SPEECH RECOGNIZER FOR MIXED-LANGUAGE SPEECH
    Seki, Hiroshi
    Watanabe, Shinji
    Hori, Takaaki
    Le Roux, Jonathan
    Hershey, John R.
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 4919 - 4923
  • [45] NEURAL-FST CLASS LANGUAGE MODEL FOR END-TO-END SPEECH RECOGNITION
    Bruguier, Antoine
    Le, Duc
    Prabhavalkar, Rohit
    Li, Dangna
    Liu, Zhe
    Wang, Bo
    Chang, Eun
    Peng, Fuchun
    Kalinli, Ozlem
    Seltzer, Michael L.
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6107 - 6111
  • [46] End-To-End deep neural models for Automatic Speech Recognition for Polish Language
    Pondel-Sycz, Karolina
    Pietrzak, Agnieszka Paula
    Szymla, Julia
    INTERNATIONAL JOURNAL OF ELECTRONICS AND TELECOMMUNICATIONS, 2024, 70 (02) : 315 - 321
  • [47] Location-Based End-to-End Speech Recognition with Multiple Language Models
    Lin, Zhijie
    Lin, Kaiyang
    Chen, Shiling
    Li, Linlin
    Zhao, Zhou
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 9975 - 9976
  • [48] END-TO-END TRAINING OF A LARGE VOCABULARY END-TO-END SPEECH RECOGNITION SYSTEM
    Kim, Chanwoo
    Kim, Sungsoo
    Kim, Kwangyoun
    Kumar, Mehul
    Kim, Jiyeon
    Lee, Kyungmin
    Han, Changwoo
    Garg, Abhinav
    Kim, Eunhyang
    Shin, Minkyoo
    Singh, Shatrughan
    Heck, Larry
    Gowda, Dhananjaya
    2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 562 - 569
  • [49] UNIFIED END-TO-END SPEECH RECOGNITION AND ENDPOINTING FOR FAST AND EFFICIENT SPEECH SYSTEMS
    Bijwadia, Shaan
    Chang, Shuo-yiin
    Li, Bo
    Sainath, Tara
    Zhang, Chao
    He, Yanzhang
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 310 - 316
  • [50] Domain Expansion for End-to-End Speech Recognition: Applications for Accent/Dialect Speech
    Ghorbani, Shahram
    Hansen, John H. L.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 762 - 774