Attention Convolutional Binary Neural Tree for Fine-Grained Visual Categorization

Cited by: 165
Authors
Ji, Ruyi [1, 2]
Wen, Longyin [3]
Zhang, Libo [1]
Du, Dawei [4]
Wu, Yanjun [1]
Zhao, Chen [1]
Liu, Xianglong [5]
Huang, Feiyue [6]
Affiliations
[1] ISCAS, State Key Lab Comp Sci, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] JD Finance Amer Corp, Mountain View, CA USA
[4] SUNY Albany, Albany, NY 12222 USA
[5] Beihang Univ, Beijing, Peoples R China
[6] Tencent Youtu Lab, Beijing, Peoples R China
Source
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020) | 2020
Funding
National Natural Science Foundation of China
DOI
10.1109/CVPR42600.2020.01048
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Fine-grained visual categorization (FGVC) is an important but challenging task due to high intra-class variance and low inter-class variance caused by deformation, occlusion, illumination, and other factors. We present an attention convolutional binary neural tree to address these problems for weakly supervised FGVC. Specifically, we incorporate convolutional operations along the edges of the tree structure and use the routing functions in each node to determine the root-to-leaf computational paths within the tree. The final decision is computed as the summation of the predictions from the leaf nodes. The deep convolutional operations learn to capture the representations of objects, and the tree structure characterizes the coarse-to-fine hierarchical feature learning process. In addition, we use an attention transformer module to force the network to capture discriminative features. Experiments on the CUB200-2011, Stanford Cars, and Aircraft datasets demonstrate that our method performs favorably against state-of-the-art methods. Code can be found at https://isrc.iscas.ac.cn/gitlab/research/acnet.
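
The routing-and-summation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`leaf_path_probs`, `tree_predict`), the sigmoid routing, the heap-order node layout, and the weighting of each leaf prediction by its path probability before summation are all assumptions made for illustration. In the actual model the routing scores and leaf logits would come from convolutional features of the image; here they are plain numbers.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def leaf_path_probs(route_logits, depth):
    """Probability of reaching each leaf of a full binary tree.

    route_logits[i] is the (hypothetical) routing score of internal node i
    in heap order: root is 0, children of node i are 2i+1 and 2i+2.
    sigmoid(score) is taken as the probability of branching right.
    """
    probs = [1.0]  # the root is always reached
    node = 0
    for _ in range(depth):
        next_level = []
        for p in probs:
            right = sigmoid(route_logits[node])
            next_level.extend([p * (1.0 - right), p * right])
            node += 1
        probs = next_level
    return probs


def tree_predict(route_logits, leaf_logits, depth):
    """Sum the leaf class distributions, each weighted by its path probability.

    The weighting is an assumption of this sketch; the abstract only states
    that the leaf predictions are summed.
    """
    probs = leaf_path_probs(route_logits, depth)
    num_classes = len(leaf_logits[0])
    out = [0.0] * num_classes
    for p, logits in zip(probs, leaf_logits):
        for c, q in enumerate(softmax(logits)):
            out[c] += p * q
    return out


if __name__ == "__main__":
    # Depth-2 tree: 3 internal nodes, 4 leaves, 3 classes (toy values).
    routes = [0.0, 0.0, 0.0]
    leaves = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0], [0.0, 0.0, 5.0]]
    print(tree_predict(routes, leaves, depth=2))
```

Because the path probabilities at each level sum to one and every leaf emits a normalized distribution, the combined output in this sketch is itself a valid class distribution.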
Pages: 10465-10474
Page count: 10