89 entries in total
[1]
RASIWASIA N, COSTA PEREIRA J, COVIELLO E, et al., A new approach to cross-modal multimedia retrieval[C], Proceedings of the 18th ACM International Conference on Multimedia, pp. 251-260, (2010)
[2]
LECUN Y, BENGIO Y, HINTON G., Deep learning[J], Nature, 521, 7553, pp. 436-444, (2015)
[3]
FROME A L, CORRADO G S, SHLENS J B, et al., DeViSE: a deep visual-semantic embedding model[C], Proceedings of NIPS, (2013)
[4]
ANDREW G, ARORA R, et al., Deep canonical correlation analysis[C], International Conference on Machine Learning, (2013)
[5]
PENG Y, QI J, YUAN Y., Modality-specific cross-modal similarity measurement with recurrent attention network[J], IEEE Transactions on Image Processing, 27, 11, pp. 5585-5599, (2018)
[6]
CORTES C, VAPNIK V., Support-vector networks[J], Machine Learning, 20, 3, pp. 273-297, (1995)
[7]
MORADE S S, PATNAIK S., Comparison of classifiers for lip reading with CUAVE and TULIPS database[J], Optik, 126, 24, pp. 5753-5761, (2015)
[8]
NGIAM J, KHOSLA A, KIM M, et al., Multimodal deep learning[C], Proceedings of ICML, (2011)
[9]
SRIVASTAVA N, SALAKHUTDINOV R., Multimodal learning with deep Boltzmann machines[J], Journal of Machine Learning Research, 15, 1, pp. 2949-2980, (2014)
[10]
VASWANI A, SHAZEER N, PARMAR N, et al., Attention is all you need[C], Advances in Neural Information Processing Systems, (2017)