Structure-Aware Deep Learning for Product Image Classification

Cited by: 30
Authors
Chen, Zhineng [1]
Ai, Shanshan [2]
Jia, Caiyan [2]
Affiliations
[1] Chinese Acad Sci, Inst Automat, 95 Zhongguancun East Rd, Beijing 100190, Peoples R China
[2] Beijing Jiaotong Univ, 3 Shangyuancun Rd, Beijing 100044, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Image classification; category hierarchy; convolutional neural network; multi-class regression; multi-task learning; ASSOCIATION
DOI
10.1145/3231742
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Automatic product image classification is a task of crucial importance with respect to the management of online retailers. Motivated by recent advancements of deep Convolutional Neural Networks (CNN) on image classification, in this work we revisit the problem in the context of product images with the existence of a predefined categorical hierarchy and attributes, aiming to leverage the hierarchy and attributes to improve classification accuracy. With these structure-aware clues, we argue that more advanced deep models could be developed beyond the flat one-versus-all classification performed by conventional CNNs. To this end, novel efforts of this work include a salient-sensitive CNN that gazes into the product foreground by inserting a dedicated spatial attention module; a multiclass regression-based refinement that is expected to predict more accurately by merging prediction scores from multiple preceding CNNs, each corresponding to a distinct classifier in the hierarchy; and a multitask deep learning architecture that effectively explores correlations among categories and attributes for categorical label prediction. Experimental results on nearly 1 million real-world product images basically validate the effectiveness of the proposed efforts individually and jointly, from which performance gains are observed.
Pages: 20