Cross-lingual Automatic Speech Recognition Exploiting Articulatory Features

Cited by: 0
Authors
Zhan, Qingran [1, 2]
Motlicek, Petr [2]
Du, Shixuan [1]
Shan, Yahui [1]
Ma, Sifan [1]
Xie, Xiang [1, 3]
Affiliations
[1] Beijing Inst Technol, Informat & Elect Inst, Beijing, Peoples R China
[2] Idiap Res Inst, Martigny, Switzerland
[3] Beijing Inst Technol, Shenzhen Res Inst, Shenzhen, Peoples R China
Source
2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) | 2019
Keywords
PHONE RECOGNITION; NEURAL-NETWORK; LANGUAGES
DOI
Not available
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Articulatory features (AFs) provide language-independent attributes by exploiting speech production knowledge. This paper proposes a cross-lingual automatic speech recognition (ASR) approach based on AFs. Various neural network (NN) architectures are explored to extract cross-lingual AFs, and their performance is studied. The architectures include the multi-layer perceptron (MLP), convolutional NN (CNN) and long short-term memory recurrent NN (LSTM). In our cross-lingual setup, only the source language (English, representing a well-resourced language) is used to train the AF extractors. AFs are then generated for the target language (Mandarin, representing an under-resourced language) using the trained extractors. The frame-classification accuracy indicates that the LSTM is able to transfer knowledge from the well-resourced to the under-resourced language through robust cross-lingual AFs. The final ASR system is built using traditional approaches (e.g. hybrid models), combining AFs with conventional MFCCs. The results demonstrate that the cross-lingual AFs improve performance in the under-resourced ASR task even though the source and target languages come from different language families. Overall, the proposed cross-lingual ASR approach provides a slight improvement over the monolingual LF-MMI and cross-lingual (acoustic model adaptation-based) ASR systems.
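The abstract does not say how the AFs and MFCCs are combined or how the frame-classification accuracy is computed. As a rough illustration only, the sketch below assumes frame-level concatenation of AF posteriors with MFCC vectors and an argmax-based accuracy. The function names, feature dimensions, and random data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def append_af_features(mfcc, af_posteriors):
    """Concatenate per-frame AF posteriors onto MFCC vectors.

    Assumption: fusion by simple frame-level concatenation; the paper
    may combine the two streams differently.
    """
    if mfcc.shape[0] != af_posteriors.shape[0]:
        raise ValueError("MFCC and AF streams must have the same number of frames")
    return np.concatenate([mfcc, af_posteriors], axis=1)

def frame_accuracy(af_posteriors, labels):
    """Fraction of frames whose most probable AF class equals the reference label."""
    return float(np.mean(np.argmax(af_posteriors, axis=1) == labels))

# Hypothetical sizes: 200 frames, 13 MFCCs, 8 classes for one AF group.
rng = np.random.default_rng(0)
num_frames, mfcc_dim, num_af_classes = 200, 13, 8
mfcc = rng.standard_normal((num_frames, mfcc_dim))

# Stand-in for the output of an AF extractor trained on the source language:
# random logits passed through a softmax so each row sums to one.
logits = rng.standard_normal((num_frames, num_af_classes))
af_posteriors = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Placeholder reference labels, set to the argmax so the accuracy is 1.0 here.
labels = af_posteriors.argmax(axis=1)

fused = append_af_features(mfcc, af_posteriors)
print(fused.shape)  # (200, 21)
```

In a real setup the posteriors would come from the trained MLP, CNN, or LSTM extractor applied to target-language audio, and the fused vectors would be passed to the acoustic model of the hybrid system.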
Pages: 1912-1916
Number of pages: 5