23 entries in total
[1]
Anderson P, He X, Buehler C, et al., Bottom-up and top-down attention for image captioning and VQA, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5676-5685, (2018)
[2]
Lu J, Xiong C, Parikh D, et al., Knowing when to look: adaptive attention via a visual sentinel for image captioning, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 375-383, (2017)
[3]
Yang Z, Yuan Y, Wu Y, et al., Review networks for caption generation, Advances in Neural Information Processing Systems, pp. 2361-2369, (2016)
[4]
Xu K, Ba J, Kiros R, et al., Show, attend and tell: neural image caption generation with visual attention, (2015)
[5]
Lin T Y, Maire M, Belongie S, et al., Microsoft COCO: common objects in context, European Conference on Computer Vision, pp. 740-755, (2014)
[6]
Sennrich R, Haddow B, Birch A, Neural machine translation of rare words with subword units, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pp. 1715-1725, (2016)
[7]
Vaswani A, Shazeer N, Parmar N, et al., Attention is all you need, Advances in Neural Information Processing Systems, pp. 5998-6008, (2017)
[8]
Devlin J, Chang M W, Lee K, et al., BERT: pre-training of deep bidirectional transformers for language understanding, (2018)
[9]
Young T, Hazarika D, Poria S, et al., Recent trends in deep learning based natural language processing, IEEE Computational Intelligence Magazine, 13, 3, pp. 55-75, (2018)
[10]
Young T, Hazarika D, Poria S, et al., Recent trends in deep learning based natural language processing, Proceedings of 2017 IEEE International Conference on Computer Vision, pp. 55-75, (2017)