Hierarchical Attention Network for Visually-Aware Food Recommendation

Cited by: 66
Authors
Gao, Xiaoyan [1]
Feng, Fuli [2]
He, Xiangnan [3]
Huang, Heyan [1]
Guan, Xinyu [4]
Feng, Chong [1]
Ming, Zhaoyan [2]
Chua, Tat-Seng [2]
Affiliations
[1] Beijing Inst Technol, Beijing Engn Res Ctr High Volume Language Informa, Sch Comp, Beijing 100081, Peoples R China
[2] Natl Univ Singapore, Sch Comp, Singapore 117417, Singapore
[3] Univ Sci & Technol China, Hefei 230031, Peoples R China
[4] Xi An Jiao Tong Univ, Syst Engn Inst, Xian 710049, Peoples R China
Funding
National Natural Science Foundation of China; National Research Foundation, Singapore;
Keywords
Visualization; Recommender systems; Collaboration; Encoding; Task analysis; Feature extraction; History; Food Recommender Systems; Hierarchical Attention; Collaborative Filtering; Ingredients; Recipe Image; EAT;
DOI
10.1109/TMM.2019.2945180
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Subject classification code
0812;
Abstract
Food recommender systems play an important role in helping users identify the food they want to eat. Deciding what to eat is a complex, multi-faceted process influenced by many factors, such as a recipe's ingredients and appearance, the user's personal food preferences, and context such as what was eaten in previous meals. This work formulates food recommendation as predicting user preference for recipes from three key factors that determine a user's food choice: 1) the user's (and other users') history; 2) the ingredients of a recipe; and 3) the descriptive image of a recipe. To address this challenging problem, this work develops a dedicated neural network-based solution, Hierarchical Attention based Food Recommendation (HAFR), which is capable of: 1) capturing the collaborative filtering effect, such as what similar users tend to eat; 2) inferring a user's preference at the ingredient level; and 3) learning user preference from the recipe's visual images. To evaluate the proposed method, this work constructs a large-scale dataset consisting of millions of ratings from AllRecipes.com. Extensive experiments show that the method outperforms several competing recommender solutions, such as Factorization Machine and Visual Bayesian Personalized Ranking, with an average improvement of 12%, offering promising results in predicting user preference for food.
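The abstract names three signals (interaction history, ingredients, recipe image) and a hierarchical attention mechanism, but this record gives no equations. The following is only a minimal sketch of the general idea, not the authors' HAFR architecture: it assumes every signal is already embedded as a small vector, uses plain dot-product attention at both levels, and all function names, vectors, and dimensions are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(weights, vectors):
    """Combine equal-length vectors using the given weights."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]

def attend(query, vectors):
    """Dot-product attention (an assumption): weight each vector by its
    similarity to the query, then pool."""
    weights = softmax([dot(query, v) for v in vectors])
    return weighted_sum(weights, vectors), weights

def score_recipe(user_vec, recipe_vec, ingredient_vecs, image_vec):
    # Level 1: ingredient-level attention, conditioned on the user.
    pooled_ingredients, ing_w = attend(user_vec, ingredient_vecs)
    # Level 2: attention over the three recipe components
    # (collaborative embedding, pooled ingredients, image feature).
    components = [recipe_vec, pooled_ingredients, image_vec]
    fused, comp_w = attend(user_vec, components)
    # Preference score: inner product of user and fused recipe vectors.
    return dot(user_vec, fused), ing_w, comp_w

# Hypothetical toy inputs (2-dimensional embeddings).
user = [1.0, 0.0]
recipe = [0.5, 0.5]
ingredients = [[2.0, 0.0], [0.0, 2.0], [-1.0, 0.0]]
image = [0.1, 0.9]

score_value, ing_weights, comp_weights = score_recipe(user, recipe, ingredients, image)
```

In this toy setup the ingredient whose embedding aligns best with the user vector receives the largest attention weight, which illustrates what inferring preference "at the ingredient level" could mean in practice. A trained model would learn the embeddings and attention parameters from rating data rather than use fixed vectors.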
Pages: 1647-1659
Number of pages: 13