共 51 条
- [1] Cheng Y(2022)Cross-modal graph matching network for image-text retrieval ACM Transact. Multimedia. Comp. Communicat. Appl. (TOMM) 18 1-23
- [2] Zhu X(2020)Multimodal feature fusion by relational reasoning and attention for visual question answering Informat. Fusion 55 116-126
- [3] Qian J(2019)Multi-source multi-level attention networks for visual question answering ACM Transact. Mult. Comput., Communicat., Applicat. 15 1-20
- [4] Wen F(2021)Object-difference drived graph convolutional networks for visual question answering Mult. Tools Appl. 80 16247-16265
- [5] Liu P(2022)Visual-semantic graph neural network with pose-position attentive learning for group activity recognition Neurocomputing 491 217-231
- [6] Zhang W(2019)Interpretable visual question answering by reasoning on dependency trees IEEE Transact. Pattern Anal. Mach. Intell. 43 887-901
- [7] Yu J(2017)Image captioning and visual question answering based on attributes and external knowledge IEEE Transact. Pattern Anal. Mach. Intell. 40 1367-1381
- [8] Hu H(2017)Faster r-cnn: Towards real-time object detection with region proposal networks IEEE Transact. Pattern Anal. Mach. Intell. 39 1137-1149
- [9] Hu H(2021)Research on visual question answering based on deep stacked attention network J. Phys. 1873 1-8
- [10] Qin Z(2022)Evaluation of graph convolutional networks performance for visual question answering on reasoning datasets Multimed. Tools Appl. 81 40361-40370