Dense Semantic Contrast for Self-Supervised Visual Representation Learning

Cited by: 21

Authors:

Li, Xiaoni [1,2]

Zhou, Yu [1,2]

Zhang, Yifei [1,2]

Zhang, Aoting [1]

Wang, Wei [1,2]

Jiang, Ning [3]

Wu, Haiying [3]

Wang, Weiping [1]

Affiliations:

[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China

[2] Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China

[3] Mashang Consumer Finance Co Ltd, Beijing, Peoples R China

Source:

PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021 | 2021

Funding:

National Natural Science Foundation of China;

Keywords:

Self-Supervised Learning; Representation Learning; Contrastive Learning; Dense Representation; Semantics Discovery;

DOI:

10.1145/3474085.3475551

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Self-supervised representation learning for visual pre-training has achieved remarkable success through sample (instance or pixel) discrimination and instance-level semantics discovery, yet a non-negligible gap remains between pre-trained models and downstream dense prediction tasks. Concretely, these downstream tasks require more accurate representations: pixels from the same object must belong to a shared semantic category, a property that previous methods lack. In this work, we present Dense Semantic Contrast (DSC), which models semantic category decision boundaries at a dense level to meet the requirements of these tasks. Furthermore, we propose a dense cross-image semantic contrastive learning framework for multi-granularity representation learning. Specifically, we explicitly explore the semantic structure of the dataset by mining relations among pixels from different perspectives. For intra-image relation modeling, we discover pixel neighbors from multiple views. For inter-image relations, we enforce pixel representations from the same semantic class to be more similar than representations from different classes within a mini-batch. Experimental results show that our DSC model outperforms state-of-the-art methods when transferring to downstream dense prediction tasks, including object detection, semantic segmentation, and instance segmentation. Code will be made available.
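The abstract does not give the loss itself, so the following is only a minimal sketch of the inter-image idea it describes: pixel embeddings that share a semantic class within a mini-batch are pulled together and embeddings of different classes are pushed apart. It uses a generic supervised-contrastive (InfoNCE-style) formulation, not the authors' implementation. The function name `dense_semantic_contrast_loss`, the `temperature` value, and the assumption that per-pixel pseudo class ids are already available (for example from clustering) are all hypothetical.

```python
import numpy as np


def dense_semantic_contrast_loss(features, labels, temperature=0.2):
    """Illustrative pixel-level contrastive loss (hypothetical, not the paper's code).

    features: (N, D) array of pixel embeddings gathered from one mini-batch.
    labels:   (N,) array of assumed pseudo semantic class ids, one per pixel.
    """
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels)

    # L2-normalize so that dot products are cosine similarities.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    logits = z @ z.T / temperature

    # A pixel is never contrasted against itself.
    self_mask = np.eye(len(labels), dtype=bool)
    logits = np.where(self_mask, -np.inf, logits)

    # Numerically stable log-softmax over all other pixels in the batch.
    row_max = logits.max(axis=1, keepdims=True)
    shifted = logits - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Positives: other pixels with the same pseudo class id.
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_count = pos_mask.sum(axis=1)
    valid = pos_count > 0
    if not valid.any():
        return 0.0  # no anchor has a positive; nothing to optimize

    # Average log-probability of the positives for each valid anchor.
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / pos_count[valid]).mean())


# Toy batch: two tight clusters of "pixel" embeddings.
toy_features = np.array([
    [1.00, 0.00], [0.99, 0.05], [0.98, -0.04],
    [0.00, 1.00], [0.05, 0.99], [-0.04, 0.98],
])
consistent = dense_semantic_contrast_loss(toy_features, np.array([0, 0, 0, 1, 1, 1]))
inconsistent = dense_semantic_contrast_loss(toy_features, np.array([0, 1, 0, 1, 0, 1]))
```

In this toy batch the loss is lower when the class ids agree with the embedding clusters (`consistent`) than when they cut across them (`inconsistent`), which is the behavior the inter-image objective is meant to reward.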


Pages: 1368-1376

Page count: 9
