Research on speech recognition technology based on Wisdom Classroom

Cited by: 0

Authors:

Luo, Kai ^{[1]}

Zhu, Guibin ^{[1]}

Luo, Rongjian ^{[1]}

Zhou, Youwei ^{[1]}

Huang, Leping ^{[1]}

Wu, Yaoyue ^{[1]}

Affiliations:

[1] Army Engn Univ PLA, Commun NCO Sch, Nanjing, Jiangsu, Peoples R China

Source:

2022 IEEE 6TH ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC) | 2022

Keywords:

Wisdom classroom; speech recognition; CNN-BiGRU; CTC-beam-search;

DOI:

10.1109/IAEAC54830.2022.9929727

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Wisdom classrooms now offer many image recognition functions, but their speech recognition capabilities still have large gaps and shortcomings. One open problem is how to give teachers and students real-time feedback on the discussion records and knowledge notes from a teacher's lesson; solving it requires real-time speech recognition. The number of training samples in the system is small, so training integrates multiple data sets. An improved CNN-BiGRU algorithm is used to fully extract speech features, and CTC-beam-search is used to obtain the optimal recognition result.


Pages: 809-813

Page count: 5

