共 23 条
- [1] [Anonymous], 2016, P AS C COMP VIS
- [2] Ba J. L., 2016, P ADV NEUR INF PROC
- [3] BREGLER C, 1994, INT CONF ACOUST SPEE, P669, DOI 10.1109/ICASSP.1994.389567
- [4] Chiu CC, 2018, 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), P4774, DOI 10.1109/ICASSP.2018.8462105
- [5] Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (01): : 30 - 42
- [6] Graves A, 2013, INT CONF ACOUST SPEE, P6645, DOI 10.1109/ICASSP.2013.6638947
- [7] Visual model structures and synchrony constraints for audio-visual speech recognition [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, 14 (03): : 1082 - 1089
- [8] He Kaiming, 2015, C COMP VIS PATT REC
- [9] Keating P.A., 1988, Phonology, V5, P275, DOI DOI 10.1017/S095267570000230X
- [10] Kingma DP, 2015, C TRACK P