Multimodal-adaptive hierarchical network for multimedia sequential recommendation

Cited by: 7
Authors
Han, Tengyue [1 ]
Niu, Shaozhang [1 ]
Wang, Pengfei [2 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing Key Lab Intelligent Telecommun Software &, Beijing 100876, Peoples R China
[2] Beijing Univ Posts & Telecommun, Sch Comp Sci, Beijing 100876, Peoples R China
Keywords
Multimedia; Multimodal; Sequential recommendation; Multimodal-adaptive
DOI
10.1016/j.patrec.2021.08.023
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recommender systems play a pivotal role in the electronic economy, especially for online shopping platforms. Studies over the past two decades have shown that exploiting the inherent properties of items contributes substantially to the accuracy of multimedia sequential recommendation. Multimedia information, including the images and text of a product, undoubtedly influences a user's purchase decision. However, modeling a user's dynamic preferences for multimodal (visual and textual in this paper) information over time remains a challenging problem. To solve this problem, we propose a Multimodal-Adaptive Hierarchical Network (MAHN for short) for multimedia sequential recommendation, which consists of a hierarchical recurrent neural network and an information modulation module between the hierarchical layers. Specifically, the hierarchical recurrent neural network re-selects multimodal information from the first layer to the second layer, while the information modulation module selects the information of each modality at time step t based on the previous time steps. Finally, to improve the generalization ability of our model, we adopt multi-task training to jointly optimize the BPR loss and a reconstruction loss on the multimodal information. Experiments are conducted on two real-world public datasets, and the results demonstrate that our model outperforms the other methods. (c) 2021 Elsevier B.V. All rights reserved.
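The joint objective mentioned in the abstract can be illustrated with a minimal sketch. The abstract names only a BPR loss and a reconstruction loss. The mean-squared-error form of the reconstruction term, the `weight` parameter, and all function names below are assumptions made for illustration, not the authors' formulation.

```python
import math


def bpr_loss(pos_scores, neg_scores):
    """Mean BPR loss, -log(sigmoid(pos - neg)), over (positive, negative) score pairs."""
    total = 0.0
    for p, n in zip(pos_scores, neg_scores):
        x = p - n
        # -log(sigmoid(x)) == log(1 + exp(-x)); use the branch that avoids overflow.
        if x >= 0:
            total += math.log1p(math.exp(-x))
        else:
            total += -x + math.log1p(math.exp(x))
    return total / len(pos_scores)


def reconstruction_loss(original, reconstructed):
    """Mean squared error between a feature vector and its reconstruction (assumed form)."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def joint_loss(pos_scores, neg_scores, original, reconstructed, weight=0.5):
    """Weighted sum of the two terms; `weight` is a hypothetical trade-off parameter."""
    return bpr_loss(pos_scores, neg_scores) + weight * reconstruction_loss(
        original, reconstructed
    )


if __name__ == "__main__":
    # One purchased item scored above one sampled negative item,
    # plus a toy two-dimensional modality feature and its reconstruction.
    print(joint_loss([2.0], [0.0], [1.0, 0.0], [0.0, 0.0]))
```

In this sketch the ranking term shrinks as the purchased item is scored further above the sampled negative, while the reconstruction term penalizes loss of the multimodal features. How the paper actually balances the two terms is not stated in the abstract.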
Pages: 10-17
Number of pages: 8