44 references in total
[1]
Ibrahim T W S, A review of audio-visual speech recognition, Journal of Telecommunication, Electronic and Computer Engineering, 10, 1-4, pp. 35-40, (2018)
[2]
Su Rong-feng, Research on speech recognition system under multiple influencing factors, (2020)
[3]
Tamura S, Ninomiya H, Kitaoka N, et al., Audio-visual speech recognition using deep bottleneck features and high-performance lipreading [C], Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, pp. 575-582, (2015)
[4]
Zeng Z, Tu J, Pianfetti B, et al., Audio-visual affect recognition through multi-stream fused HMM for HCI [C], IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), pp. 967-972, (2005)
[5]
Wei Bin, Analysis of the integration path of symbolism and connectionism of artificial intelligence, Study of Dialectics of Nature, 38, 2, pp. 23-29, (2022)
[6]
Zhang B, Zhu J, Su H., Toward the third generation artificial intelligence, Science China Information Sciences, 66, 2, pp. 1-19, (2023)
[7]
Jiao Li-cheng, Yang Shu-yuan, Liu Fang, et al., Seventy years of neural networks: retrospect and prospect, Chinese Journal of Computers, 39, 8, pp. 1697-1716, (2016)
[8]
Ivanko D, Ryumin D, Karpov A., A review of recent advances on deep learning methods for audio-visual speech recognition, Mathematics, 11, 12, (2023)
[9]
Wang D, Wang X D, Lyu S H., An overview of end-to-end automatic speech recognition, Symmetry, 11, 8, (2019)
[10]
Yu W, Zeiler S, Kolossa D., Fusing information streams in end-to-end audio-visual speech recognition [C], IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3430-3434, (2021)