Automatic Speech Recognition in Different Languages Using High-Density Surface Electromyography Sensors

Cited by: 5
Authors
Zhu, Mingxing [1 ,2 ,3 ]
Huang, Zhen [4 ]
Wang, Xiaochen [1 ,2 ,3 ]
Wang, Xin [1 ,2 ,3 ]
Wang, Cheng [1 ,2 ,3 ]
Zhang, Haoshi [1 ,2 ,3 ]
Zhao, Guoru [5 ,6 ]
Chen, Shixiong [5 ,6 ]
Li, Guanglin [5 ,6 ]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Key Lab Human Machine Intelligence Synergy Syst, Shenzhen 518055, Peoples R China
[2] Joint Lab Human Machine Intelligence Synergy Syst, Shenzhen 518055, Peoples R China
[3] Univ Chinese Acad Sci, Shenzhen Coll Adv Technol, Shenzhen 518055, Peoples R China
[4] Guangzhou Panyu Cent Hosp, Dept Rehabil Med, Guangzhou 511400, Peoples R China
[5] Chinese Acad Sci, Shenzhen Inst Adv Technol, CAS Key Lab Human Machine Intelligence Synergy Sys, Shenzhen 518055, Peoples R China
[6] Joint Lab Human Machine Intelligence Synergy Syst, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Sensors; Muscles; Neck; Sensor systems; Speech recognition; Face recognition; Sensor phenomena and characterization; Automatic speech recognition; high-density surface electromyography; sensors placement; sequential forward selection algorithm; PHONATION; SELECTION;
DOI
10.1109/JSEN.2020.3037061
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808 ; 0809 ;
Abstract
Automatic speech recognition (ASR) based on surface electromyography (sEMG) sensors is an important technology converting electrical signals into computer-readable textual messages, which can overcome the limitation of acoustic sensors that are easily contaminated by environmental noises. However, current placements of sEMG sensors mainly depend on the experimenter's experience, which could miss important information about the major muscular activities and lead to the decline of classification performance. In this study, 120 closely-spaced sEMG sensors were utilized to collect high-density sEMG signals for recognizing ten digits in English and Chinese. The linear discriminant analysis classifier was used to classify the speaking tasks, and the sequential forward selection algorithm was utilized for analyzing the optimal position of the sensors. The results showed that the HD sEMG energy maps could help visualize the dynamic muscle activities during the speaking process, and significantly different muscular contraction patterns were observed for different speaking tasks. The classification accuracies when using the facial sensors were significantly lower than those on the neck, although with the same number of sensors. Moreover, the classification rates could be higher than 90% with only 15 optimally selected sensors that were mainly distributed on the neck instead of the face. This study suggests that the neck muscles could be the main contributor, and more sEMG sensors should be placed on the neck to improve the ASR performance. The findings of this study could provide valuable clues for the development of a practical sEMG-based speech recognition system, especially for patients with speaking disorders.
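The sequential forward selection step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic (8 channels rather than 120 sensors, 3 classes rather than ten digits), a nearest-centroid classifier stands in for linear discriminant analysis, and all names (`make_samples`, `nearest_centroid_accuracy`, `sequential_forward_selection`) are hypothetical. The greedy loop adds, at each step, the channel that most improves held-out accuracy.

```python
import random

# Assumed synthetic setup: only channels 2 and 5 carry class information.
CLASS_MEANS = {0: (0.0, 0.0), 1: (4.0, 4.0), 2: (4.0, 0.0)}


def make_samples(rng, n_per_class, n_channels=8):
    """Generate (features, label) pairs with unit Gaussian noise on every channel."""
    samples = []
    for label, (mean_ch2, mean_ch5) in CLASS_MEANS.items():
        for _ in range(n_per_class):
            features = [rng.gauss(0.0, 1.0) for _ in range(n_channels)]
            features[2] += mean_ch2
            features[5] += mean_ch5
            samples.append((features, label))
    return samples


def nearest_centroid_accuracy(train, held_out, channels):
    """Accuracy of a nearest-centroid classifier restricted to `channels`.

    This is a simple stand-in for the LDA classifier mentioned in the abstract.
    """
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(channels))
        for i, c in enumerate(channels):
            acc[i] += features[c]
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

    correct = 0
    for features, label in held_out:
        vec = [features[c] for c in channels]
        predicted = min(
            centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(vec, centroids[lab])),
        )
        correct += predicted == label
    return correct / len(held_out)


def sequential_forward_selection(train, held_out, n_select):
    """Greedily add the channel that yields the best held-out accuracy."""
    n_channels = len(train[0][0])
    selected, history = [], []
    while len(selected) < n_select:
        best_channel, best_score = None, -1.0
        for c in range(n_channels):
            if c in selected:
                continue
            score = nearest_centroid_accuracy(train, held_out, selected + [c])
            if score > best_score:
                best_channel, best_score = c, score
        selected.append(best_channel)
        history.append(best_score)
    return selected, history


rng = random.Random(0)
train = make_samples(rng, 60)
held_out = make_samples(rng, 60)
selected, history = sequential_forward_selection(train, held_out, n_select=2)
print("selected channels:", selected)
print("accuracy after each step:", history)
```

In a real pipeline the scorer would be cross-validated rather than evaluated on a single held-out split, and each sensor would typically contribute several features rather than one value.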
Pages: 14155-14167
Number of pages: 13