Learning to Segment the Tail

Times cited: 33
Authors
Hu, Xinting [1]
Jiang, Yi [2]
Tang, Kaihua [1]
Chen, Jingyuan [3]
Miao, Chunyan [1]
Zhang, Hanwang [1]
Affiliations
[1] Nanyang Technological University, Singapore
[2] Alibaba Group, Hangzhou, Zhejiang, People's Republic of China
[3] Alibaba Group, DAMO Academy, Hangzhou, Zhejiang, People's Republic of China
Source
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020) | 2020
DOI
10.1109/CVPR42600.2020.01406
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Real-world visual recognition requires handling the extreme sample imbalance in large-scale long-tailed data. We propose a "divide & conquer" strategy for the challenging LVIS task: divide the whole data into balanced parts and then apply incremental learning to conquer each one. This derives a novel learning paradigm: class-incremental few-shot learning, which is especially effective for the challenges that evolve over time: 1) the class imbalance in the old-class knowledge review and 2) the few-shot data in new-class learning. We call our approach Learning to Segment the Tail (LST). In particular, we design an instance-level balanced replay scheme, which is a memory-efficient approximation to balance the instance-level samples from the old-class images. We also propose to use a meta-module for new-class learning, where the module parameters are shared across incremental phases, gaining the learning-to-learn knowledge incrementally, from the data-rich head to the data-poor tail. We empirically show that, at the cost of slight head-class forgetting, we can gain a significant 8.3% AP improvement for the tail classes with fewer than 10 instances, achieving an overall 2.0% AP boost for the whole 1,230 classes.
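The "divide" step and the balanced replay idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the parts are formed or how replay samples are chosen, so the sketch assumes that classes are sorted by instance count and cut into contiguous groups, and that replay draws the same number of instances for every old class. The function names `split_into_phases` and `balanced_replay`, their parameters, and the toy class counts are all hypothetical.

```python
import random
from typing import Dict, List, Sequence


def split_into_phases(instance_counts: Dict[str, int], num_phases: int) -> List[List[str]]:
    """Order classes from most to least frequent, then cut into contiguous groups.

    Assumption: contiguous groups of a frequency-sorted list stand in for the
    "balanced parts" of the abstract; the paper may form them differently.
    """
    if num_phases < 1:
        raise ValueError("num_phases must be at least 1")
    # Ties are broken by class name so the ordering is deterministic.
    ordered = sorted(instance_counts, key=lambda c: (-instance_counts[c], c))
    base, extra = divmod(len(ordered), num_phases)
    phases, start = [], 0
    for i in range(num_phases):
        size = base + (1 if i < extra else 0)  # earlier phases absorb the remainder
        phases.append(ordered[start:start + size])
        start += size
    return phases


def balanced_replay(
    instances_by_class: Dict[str, Sequence[int]],
    old_classes: Sequence[str],
    per_class: int,
    seed: int = 0,
) -> Dict[str, List[int]]:
    """Draw at most `per_class` instance ids for every previously learned class.

    Assumption: a uniform per-class budget is a simplified stand-in for the
    instance-level balanced replay scheme named in the abstract.
    """
    rng = random.Random(seed)
    replay = {}
    for cls in old_classes:
        pool = list(instances_by_class[cls])
        replay[cls] = rng.sample(pool, min(per_class, len(pool)))
    return replay


# Toy usage with made-up class counts (head classes first, tail classes last).
counts = {"person": 1000, "car": 500, "dog": 50, "kite": 8, "abacus": 3}
phases = split_into_phases(counts, num_phases=2)
pools = {name: range(n) for name, n in counts.items()}
replay = balanced_replay(pools, old_classes=phases[0], per_class=10)
```

In this toy run the first phase holds the three most frequent classes, and the replay set for the next phase contains ten instances from each of them, so the data-rich head no longer dominates the review of old classes.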
Pages: 14042-14051
Number of pages: 10