共 267 条
- [1] Abdel-Hamid O, 2013, INT CONF ACOUST SPEE, P7942, DOI 10.1109/ICASSP.2013.6639211
- [2] Deep Lip Reading: a comparison of models and an online application [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3514 - 3518
- [3] Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 4971 - 4980
- [4] Aila T., 2018, INT C LEARN REPR ICL
- [5] Alberti C., 2019, INT C MACH LEARN COM
- [6] Anastasopoulos A., 2019, ARXIV190302930
- [7] Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 6077 - 6086
- [8] Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 3674 - 3683
- [9] Andreas J., 2016, C N AM CHAPT ASS COM
- [10] [Anonymous], 2018, IEEE T NEUR NET LEAR, DOI DOI 10.1109/TNNLS.2018.2817340