DRNet: Towards fast, accurate and practical dish recognition

Cited by: 4
Authors
Chu BinFei [1, 2]
Zhong BiNeng [1]
Zhang ZiKai [1, 2]
Liu Xin [3]
Tang ZhenJun [1]
Li XianXian [1]
Cheng SiYuan [1, 2]
Affiliations
[1] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541004, Peoples R China
[2] Huaqiao Univ, Dept Comp Sci & Technol, Xiamen 361021, Peoples R China
[3] Seetatech Technol, Nanjing 211800, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
neural network acceleration; neural network quantization; object detection; reidentification; dish recognition
DOI
10.1007/s11431-021-1903-4
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Existing dish recognition algorithms mainly focus on accuracy over a predefined set of classes, which limits their scope of application. In this paper, we propose a practical two-stage dish recognition framework (DRNet) that balances speed and accuracy while adapting to changes in the number of classes. In the first stage, we build an arbitrary-oriented dish detector (AODD) to localize dishes, which effectively alleviates the impact of background noise and pose variations. In the second stage, we propose a dish reidentifier (DReID) that recognizes registered dishes and can therefore handle uncertain categories. To further improve the accuracy of DRNet, we design an attribute recognition (AR) module that predicts dish attributes, which serve as auxiliary information to enhance the discriminative ability of DRNet. Moreover, we apply pruning and quantization to our model so that it can be deployed in embedded environments. Finally, to facilitate the study of dish recognition, we establish a well-annotated dataset. Our AODD, DReID, AR, and DRNet run at about 14, 25, 16, and 5 fps, respectively, on the RKNN 3399 pro hardware.
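The idea of recognizing "registered" dishes instead of a fixed class set can be illustrated with a gallery lookup. The sketch below is not the DReID method from the paper: it assumes that each detected dish has already been turned into an embedding vector by some feature extractor, and it matches that vector against registered embeddings by cosine similarity. The names `DishGallery`, `register`, `identify`, the threshold of 0.8, and the toy three-dimensional embeddings are all hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class DishGallery:
    """Hypothetical gallery of registered dish embeddings (illustrative only)."""

    def __init__(self, threshold=0.8):
        # Assumed similarity cutoff below which a query counts as unregistered.
        self.threshold = threshold
        self.entries = {}

    def register(self, name, embedding):
        # Adding a dish needs no retraining: the class set is just this dictionary.
        self.entries[name] = list(embedding)

    def identify(self, embedding):
        # Return (best matching name or None, best similarity score).
        best_name, best_score = None, -1.0
        for name, reference in self.entries.items():
            score = cosine_similarity(embedding, reference)
            if score > best_score:
                best_name, best_score = name, score
        if best_score < self.threshold:
            return None, best_score
        return best_name, best_score


# Toy embeddings standing in for the output of a feature extractor.
gallery = DishGallery(threshold=0.8)
gallery.register("mapo_tofu", [0.9, 0.1, 0.0])
gallery.register("fried_rice", [0.1, 0.9, 0.2])

match, score = gallery.identify([0.85, 0.15, 0.05])
print(match, round(score, 3))
```

Under these assumptions, supporting a new dish only requires one more `register` call, which is the property that lets a reidentification-style design cope with a changing number of classes.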
Pages: 2651-2661
Number of pages: 11