MULTI-LABEL IMAGE RECOGNITION WITH JOINT CLASS-AWARE MAP DISENTANGLING AND LABEL CORRELATION EMBEDDING

Cited by: 36

Authors:

Chen, Zhao-Min [1,2]

Wei, Xiu-Shen [2]

Jin, Xin [2]

Guo, Yanwen [1,3]

Affiliations:

[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Jiangsu, Peoples R China

[2] Megvii Technol, Megvii Res Nanjing, Nanjing, Jiangsu, Peoples R China

[3] Sci & Technol Informat Syst Engn Lab, Nanjing, Jiangsu, Peoples R China

Source:

2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME) | 2019

Funding:

National Key Research and Development Program of China; National Natural Science Foundation of China

Keywords:

Multi-label image recognition; label correlation; CNNs; class-aware disentangled maps (CADMs);

DOI:

10.1109/ICME.2019.00113

Chinese Library Classification (CLC) number:

TP31 [Computer Software]

Discipline classification codes:

081202; 0835

Abstract:

Multi-label image recognition is a fundamental but challenging computer vision task. Great progress has been achieved by exploring the correlation among multiple labels, which is the most crucial issue for multi-label recognition. In this paper, we propose a unified deep learning framework that jointly disentangles class-specific maps corresponding to discriminative category-wise information and then evaluates the label co-occurrence of these maps. Specifically, after obtaining the general deep image features and conducting multi-label classification, we employ the classification weights to reform the feature maps into class-aware disentangled maps (CADMs). Then, based on the CADMs, we first transform them into label vectors and then formulate the label correlation dependency from an embedding perspective. The whole model is driven by both the classification loss and the label correlation embedding loss, and is end-to-end trainable with only image-level supervision. Extensive quantitative results on two benchmark multi-label image datasets show that our model consistently outperforms other competing methods by a large margin. Meanwhile, qualitative analyses also demonstrate that our model can effectively capture relatively pure class-aware maps and model label correlation dependency as well.
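The step of reforming feature maps with classification weights can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact formulation, so the sketch assumes that each class's maps are obtained by scaling every feature channel with that class's classifier weight, and that label vectors come from global average pooling. The function names and the toy numbers are hypothetical.

```python
def global_average_pool(channel):
    """Mean of one H x W map given as a list of rows."""
    values = [v for row in channel for v in row]
    return sum(values) / len(values)


def class_aware_maps(features, weights):
    """Assumed reforming step: scale each channel by the class's weight.

    features: C x H x W nested lists (shared deep feature maps).
    weights:  K x C nested lists (linear classifier weights, no bias).
    Returns K x C x H x W nested lists, one set of maps per class.
    """
    return [
        [[[w * v for v in row] for row in channel]
         for w, channel in zip(class_weights, features)]
        for class_weights in weights
    ]


def label_vectors(maps):
    """Assumed pooling step: one C-dimensional vector per class."""
    return [[global_average_pool(channel) for channel in class_maps]
            for class_maps in maps]


# Toy input: C = 2 channels of size 2 x 2, K = 3 classes.
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
weights = [[1.0, 0.0], [0.5, -1.0], [0.0, 2.0]]

maps = class_aware_maps(features, weights)
vectors = label_vectors(maps)

# Under these assumptions, summing a class's label vector recovers the
# logit of a pooling-plus-linear classifier for that class.
logits = [sum(vec) for vec in vectors]
```

Under this assumed formulation, the per-class vectors keep channel-wise detail that a single logit discards, which is what would let a later embedding step compare labels with one another.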

Citation

Pages: 622-627

Page count: 6
