Contrastive Deep Supervision

Cited by: 37
Authors
Zhang, Linfeng [1 ]
Chen, Xin [2 ]
Zhang, Junbo [1 ]
Dong, Runpei [3 ]
Ma, Kaisheng [1 ]
Affiliations
[1] Tsinghua Univ, Beijing, Peoples R China
[2] Intel Corp, Santa Clara, CA USA
[3] Xi An Jiao Tong Univ, Xian, Peoples R China
Source
COMPUTER VISION, ECCV 2022, PT XXVI | 2022 / Vol. 13686
Keywords
DOI
10.1007/978-3-031-19809-0_1
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
The success of deep learning is usually accompanied by growth in neural network depth. However, the traditional training method supervises the neural network only at its last layer and propagates the supervision layer by layer, which makes the intermediate layers hard to optimize. Recently, deep supervision has been proposed to add auxiliary classifiers to the intermediate layers of deep neural networks. By optimizing these auxiliary classifiers with the supervised task loss, supervision can be applied to the shallow layers directly. However, deep supervision conflicts with the well-known observation that shallow layers learn low-level features instead of task-biased high-level semantic features. To address this issue, this paper proposes a novel training framework named Contrastive Deep Supervision, which supervises the intermediate layers with augmentation-based contrastive learning. Experimental results on nine popular datasets with eleven models demonstrate its effectiveness on general image classification, fine-grained image classification and object detection in supervised learning, semi-supervised learning and knowledge distillation. Code has been released on GitHub.
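The idea described in the abstract — intermediate layers supervised by an augmentation-based contrastive loss instead of the task loss — can be sketched as follows. This is a minimal illustration, not the authors' released code: the toy backbone, the projection heads (`proj1`, `proj2`), the `info_nce` helper, and the weighting factor `lam` are all hypothetical, and the contrastive term uses a standard InfoNCE (NT-Xent) formulation on two augmented views.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.5):
    # Standard InfoNCE: each embedding's positive is its paired view,
    # all other embeddings in the batch act as negatives.
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                  # (2B, d)
    sim = z @ z.t() / tau                           # cosine similarities
    n = z1.size(0)
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool), float('-inf'))
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

class ContrastiveDeepSupervisionNet(nn.Module):
    """Toy backbone: projection heads replace the auxiliary classifiers
    of classical deep supervision at the intermediate stages."""
    def __init__(self, num_classes=10, dim=32, proj_dim=16):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Linear(8, dim), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
        self.head = nn.Linear(dim, num_classes)     # task classifier
        self.proj1 = nn.Linear(dim, proj_dim)       # auxiliary projection heads
        self.proj2 = nn.Linear(dim, proj_dim)

    def forward(self, x):
        h1 = self.stage1(x)
        h2 = self.stage2(h1)
        return self.head(h2), [self.proj1(h1), self.proj2(h2)]

def training_loss(model, view1, view2, labels, lam=1.0):
    # Task loss on the final layer; contrastive loss on every
    # intermediate projection, computed between the two views.
    logits, p1 = model(view1)
    _, p2 = model(view2)
    task = F.cross_entropy(logits, labels)
    contrast = sum(info_nce(a, b) for a, b in zip(p1, p2))
    return task + lam * contrast

model = ContrastiveDeepSupervisionNet()
x = torch.randn(4, 8)
# Two "augmented views" (here just noisy copies of the same inputs).
loss = training_loss(model, x + 0.1 * torch.randn_like(x),
                     x + 0.1 * torch.randn_like(x),
                     torch.tensor([0, 1, 2, 3]))
loss.backward()
```

Only the total loss is backpropagated; the projection heads are discarded at inference, so the deployed network keeps the plain backbone-plus-classifier topology.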
Pages: 1-19 (19 pages)