Comparison of DCT and Autoencoder-based Features for DNN-HMM Multimodal Silent Speech Recognition

Cited by: 0

Authors:

Liu, Licheng [1]

Ji, Yan [1]

Wang, Hongcui [1]

Denby, Bruce [1]

Affiliations:

[1] Tianjin Univ, Sch Comp Sci & Technol, Tianjin, Peoples R China

Source:

2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2016

Keywords:

silent speech recognition; feature extraction; autoencoder; non-acoustic feature

DOI:

Not available

Chinese Library Classification number:

TP301 [Theory, methods]

Discipline classification code:

081202

Abstract:

Hidden Markov Model and Deep Neural Network-Hidden Markov Model speech recognition performance for a portable ultrasound + video multimodal silent speech interface is investigated using Discrete Cosine Transform and Deep Auto Encoder-based features with a range of dimensionalities. Experimental results show that the two types of features achieve similar Word Error Rate, but that the autoencoder features maintain good performance even for very low-dimension feature vectors, demonstrating potential as a very compact representation of the information in multimodal silent speech data. It is also shown for the first time that the Deep Network/Markov approach, which has been demonstrated to be beneficial for acoustic speech recognition and for articulatory sensor-based silent speech, improves the silent speech recognition performance for video-based silent speech recognition as well.


Pages: 5
