Cross-lingual Automatic Speech Recognition Exploiting Articulatory Features

Cited: 0
|
Authors
Zhan, Qingran [1 ,2 ]
Motlicek, Petr [2 ]
Du, Shixuan [1 ]
Shan, Yahui [1 ]
Ma, Sifan [1 ]
Xie, Xiang [1 ,3 ]
Affiliations
[1] Beijing Inst Technol, Informat & Elect Inst, Beijing, Peoples R China
[2] Idiap Res Inst, Martigny, Switzerland
[3] Beijing Inst Technol, Shenzhen Res Inst, Shenzhen, Peoples R China
Source
2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) | 2019
Keywords
PHONE RECOGNITION; NEURAL-NETWORK; LANGUAGES;
DOI
Not available
Chinese Library Classification
TP31 [Computer Software];
Discipline Codes
081202 ; 0835 ;
Abstract
Articulatory features (AFs) provide language-independent attributes by exploiting speech production knowledge. This paper proposes a cross-lingual automatic speech recognition (ASR) system based on AF methods. Various neural network (NN) architectures are explored to extract cross-lingual AFs, and their performance is studied. The architectures include the multi-layer perceptron (MLP), convolutional NN (CNN), and long short-term memory recurrent NN (LSTM). In our cross-lingual setup, only the source language (English, representing a well-resourced language) is used to train the AF extractors. AFs are then generated for the target language (Mandarin, representing an under-resourced language) using the trained extractors. The frame-classification accuracy indicates that the LSTM is able to transfer knowledge from the well-resourced to the under-resourced language through robust cross-lingual AFs. The final ASR system is built using traditional approaches (e.g. hybrid models), combining AFs with conventional MFCCs. The results demonstrate that the cross-lingual AFs improve performance on the under-resourced ASR task even though the source and target languages come from different language families. Overall, the proposed cross-lingual ASR approach provides a slight improvement over the monolingual LF-MMI and cross-lingual (acoustic model adaptation-based) ASR systems.
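The feature combination step the abstract describes (appending cross-lingual AF posteriors to conventional MFCC frames before feeding them to the hybrid acoustic model) can be sketched as follows. This is an illustrative sketch, not the authors' code; the frame count and dimensionalities are assumptions, and the random arrays stand in for real MFCC frames and AF-extractor outputs.

```python
import numpy as np

# Illustrative dimensions (assumptions, not values from the paper):
# 13-dim MFCCs and a 24-dim vector of AF posteriors per frame.
n_frames, mfcc_dim, af_dim = 100, 13, 24

rng = np.random.default_rng(0)
mfcc = rng.standard_normal((n_frames, mfcc_dim))  # stand-in for conventional MFCC frames
af = rng.random((n_frames, af_dim))               # stand-in for cross-lingual AF extractor output

# Frame-wise concatenation yields the combined (tandem-style) feature
# vectors that a hybrid ASR system would consume.
tandem = np.concatenate([mfcc, af], axis=1)
print(tandem.shape)  # (100, 37)
```

In practice the AF posteriors would come from the LSTM extractor trained on the source language and applied to target-language audio, frame-aligned with the MFCCs.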
Pages: 1912 - 1916
Page count: 5
Related Papers
50 records in total
  • [1] Exploiting Morpheme and Cross-lingual Knowledge to Enhance Mongolian Named Entity Recognition
    Zhang, Songming
    Zhang, Ying
    Chen, Yufeng
    Wu, Du
    Xu, Jinan
    Liu, Jian
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (05)
  • [2] Domain-Adversarial Based Model with Phonological Knowledge for Cross-Lingual Speech Recognition
    Zhan, Qingran
    Xie, Xiang
    Hu, Chenguang
    Zuluaga-Gomez, Juan
    Wang, Jing
    Cheng, Haobo
    ELECTRONICS, 2021, 10 (24)
  • [4] Self-organizing speech recognition that processes acoustic and articulatory features
    Hesdras O. Viana
    Aluízio F. R. Araújo
    Danilo S. Barbosa
    Multimedia Tools and Applications, 2024, 83 : 39169 - 39195
  • [6] A cross-lingual adaptation approach for rapid development of speech recognizers for learning disabled users
    Bohac, Marek
    Kucharova, Michaela
    Callejas, Zoraida
    Nouza, Jan
    Cerva, Petr
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2014 : 1 - 13
  • [7] Articulatory and excitation source features for speech recognition in read, extempore and conversation modes
    Manjunath, K. E.
    Rao, K. Sreenivasa
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2016, 19 (01) : 121 - 134
  • [8] Robust phone set mapping using decision tree clustering for cross-lingual phone recognition
    Sim, Khe Chai
    Li, Haizhou
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4309 - 4312
  • [9] Evaluating cross-lingual textual similarity on dictionary alignment problem
    Sever, Yigit
    Ercan, Gonenc
    LANGUAGE RESOURCES AND EVALUATION, 2020, 54 (04) : 1059 - 1078
  • [10] Emotion Detection in Cross-Lingual Text Based on Bidirectional LSTM
    Ren, Han
    Wan, Jing
    Ren, Yafeng
    SECURITY WITH INTELLIGENT COMPUTING AND BIG-DATA SERVICES, 2020, 895 : 838 - 845