25 entries in total
[1]
Lin T.Y., Maire M., Belongie S., Hays J., Perona P., Ramanan D., Dollar P., Zitnick L., Microsoft COCO: Common objects in context, Computer vision—ECCV, 8693, (2014)
[2]
Xu K., Ba J., Kiros R., Cho K., Courville A., Salakhutdinov R., Zemel R., Bengio Y., Show, attend and tell: Neural image caption generation with visual attention, In: Proceedings of the 32nd International Conference on Machine Learning (ICML), pp. 2048-2057, (2015)
[3]
Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A.N., Kaiser L., Polosukhin I., Attention is all you need, In: Advances in Neural Information Processing Systems, pp. 5998-6008, (2017)
[4]
Weston J., Bengio S., Usunier N., WSABIE: Scaling up to large vocabulary image annotation, IJCAI International Joint Conference on Artificial Intelligence, pp. 2764-2770, (2011)
[5]
Vendrov I., Kiros R., Fidler S., Urtasun R., Order-embeddings of images and language, International Conference on Learning Representations, (2016)
[6]
Gu J., Cai J., Joty S., Niu L., Wang G., Look, imagine and match: Improving textual-visual cross-modal retrieval with generative models, In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7181-7189, (2018)
[8]
Chen Y., Wang J.Z., Krovetz R., Content-based image retrieval by clustering, Proceedings of the 5th ACM SIGMM International Workshop on Multimedia Information Retrieval, pp. 193-200, (2003)
[9]
Sheikholeslami G., Chang W., Zhang A., SemQuery: semantic clustering and querying on heterogeneous features for visual data, IEEE Trans Knowl Data Eng, 14, 5, pp. 988-1002, (2002)
[10]
Smith J.R., Chang S.F., VisualSEEK: A fully automated content-based query system, Proc. 4th ACM Int'l Conf. on Multimedia, pp. 87-88, (1996)