Wide-Slice Residual Networks for Food Recognition

Cited by: 141
Authors
Martinel, Niki [1]
Foresti, Gian Luca [2]
Micheloni, Christian [1]
Affiliations
[1] Univ Udine, Machine Learning Percept Lab, Udine, Italy
[2] Univ Udine, AViReS Lab, Udine, Italy
Source
2018 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2018) | 2018
Keywords
DIETARY ASSESSMENT;
DOI
10.1109/WACV.2018.00068
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Image-based food recognition poses new challenges for mainstream computer vision algorithms. Recent works in the field have focused either on hand-crafted representations or on learning them with deep neural networks (DNNs). Despite the success of DNN-based works, they rely on off-the-shelf deep architectures that are not tailored to the specific food classification problem. We believe that better results can be obtained if the architecture is defined with respect to an analysis of the food composition. Following this intuition, this work introduces a new deep scheme designed to handle the food structure. In particular, we focus on the vertical food traits that are common to a large number of categories (i.e., 15% of the whole data in current datasets). Towards this objective, we first introduce a slice convolution block to capture such specific information. Then, we leverage the recent success of deep residual blocks and combine them with the slice convolution to produce the classification score. Extensive evaluations on three benchmark datasets demonstrate that our solution outperforms existing approaches (e.g., a top-1 accuracy of 90.27% on the Food-101 dataset).
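The slice convolution idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes (the abstract does not state this) that a slice kernel spans the full image width and slides only vertically, so each output value summarizes one horizontal band of the image. The function name `slice_convolution`, the toy image, and the kernel values are hypothetical; the residual branch and the classifier are omitted.

```python
from typing import List

Matrix = List[List[float]]


def slice_convolution(image: Matrix, kernel: Matrix) -> List[float]:
    """Slide a full-width kernel down a single-channel image.

    Assumption (not from the abstract): the kernel is exactly as wide as
    the image, so it moves only along the vertical axis and produces one
    response per vertical position.
    """
    height, width = len(image), len(image[0])
    k_height, k_width = len(kernel), len(kernel[0])
    if k_width != width:
        raise ValueError("slice kernel must span the full image width")
    if k_height > height:
        raise ValueError("slice kernel is taller than the image")

    responses = []
    for top in range(height - k_height + 1):
        total = 0.0
        for dy in range(k_height):
            for x in range(width):
                total += image[top + dy][x] * kernel[dy][x]
        responses.append(total)
    return responses


# Toy image with two stacked layers: a bright band above a dark band.
image = [
    [1, 1, 1],
    [1, 1, 1],
    [0, 0, 0],
    [0, 0, 0],
]
# Hypothetical kernel that responds where a bright row sits on a dark row.
kernel = [
    [1, 1, 1],
    [-1, -1, -1],
]

print(slice_convolution(image, kernel))  # [0.0, 3.0, 0.0]
```

The single non-zero response marks the boundary between the two layers, which is the kind of vertically stacked structure the abstract refers to as vertical food traits.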
Pages: 567-576
Number of pages: 10