36 references in total
- [1] Vinyals O., Toshev A., Bengio S., et al., Show and tell: a neural image caption generator, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3156-3164, (2015)
- [2] Xu K., Ba J., Kiros R., et al., Show, attend and tell: neural image caption generation with visual attention, Proceedings of International Conference on Machine Learning, pp. 2048-2057, (2015)
- [3] Krizhevsky A., Sutskever I., Hinton G.E., ImageNet classification with deep convolutional neural networks, Proceedings of the 25th International Conference on Neural Information Processing Systems, 1, pp. 1097-1105, (2012)
- [4] Simonyan K., Zisserman A., Very deep convolutional networks for large-scale image recognition
- [5] He K.M., Zhang X.Y., Ren S.Q., et al., Deep residual learning for image recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, (2016)
- [6] Bahdanau D., Cho K., Bengio Y., Neural machine translation by jointly learning to align and translate
- [7] Sun F., Qin K., Sun W., et al., Image saliency detection based on region merging, Journal of Computer-Aided Design & Computer Graphics, 28, 10, pp. 1679-1687, (2016)
- [8] Gao S., Zhang L., Li C., et al., Image saliency detection via graph representation with fusing low-level and high-level features, Journal of Computer-Aided Design & Computer Graphics, 28, 3, pp. 420-426, (2016)
- [9] You Q.Z., Jin H.L., Wang Z.W., et al., Image captioning with semantic attention, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4651-4659, (2016)
- [10] Gu J.X., Cai J.F., Wang G., et al., Stack-captioning: coarse-to-fine learning for image captioning, Proceedings of AAAI Conference on Artificial Intelligence, pp. 6837-6844, (2018)