Complementary Mask Self-Supervised Pre-training Based on Teacher-Student Network

Cited by: 0
Authors
Ye, Shaoxiong [1 ]
Huang, Jing [1 ]
Zhu, Lifu [1 ]
Affiliations
[1] Wuhan Univ Technol, Sch Comp Sci & Technol, Wuhan, Hubei, Peoples R China
Keywords
Pre-training model; Self-supervised; Masked image modeling; Contrastive learning; Encoder;
DOI
10.1109/ACCTCS58815.2023.00082
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a complementary self-supervised masking model based on a teacher-student network. The model consists of a student network, a teacher network, and a mask prediction module. The student network is an encoder, while the teacher network consists of an encoder and a decoder; the teacher and student encoders learn image representations and share the same architecture and parameters. Pre-training uses two pretext tasks. First, the masked patch representations produced by the teacher's decoder are passed through the mask prediction module to reconstruct the original image pixels. Second, a contrastive learning loss aligns the outputs of the teacher and student networks in representation space. To reduce the mismatch between upstream pre-training and downstream tasks in masked image modeling (MIM), we introduce a complementary masking mechanism: the same complete image is fed to both networks, the teacher's input is randomly masked at a high ratio (e.g., 75%), and the student masks the complementary portion (the remaining 25%). The proposed model is pre-trained on COCO and other datasets, and downstream tasks are evaluated on four standard datasets. Comparisons with recent self-supervised pre-trained models show that the proposed model learns better representations.
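To make the complementary masking scheme and the two pretext losses concrete, the sketch below gives a minimal PyTorch-style illustration: the teacher encodes the patches left visible after 75% masking, a decoder plus a pixel head reconstructs the hidden patches, and the student encodes the complementary 75% and is aligned with the teacher through an InfoNCE-style contrastive loss. This is a simplified sketch under stated assumptions, not the authors' implementation: the patch size, embedding dimension, module names (PatchEncoder, ComplementaryMaskModel, pixel_head), mean-pooled global features, and the temperature 0.1 are illustrative choices, and positional embeddings, augmentations, and other training details are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative hyper-parameters (not from the paper): 224x224 images, 16x16 patches.
PATCH, DIM, MASK_RATIO = 16, 128, 0.75


class PatchEncoder(nn.Module):
    """Stand-in for the ViT-style encoder shared by teacher and student."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(3 * PATCH * PATCH, DIM)
        self.block = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)

    def forward(self, patches):                    # (B, N_visible, 3*P*P)
        return self.block(self.proj(patches))      # (B, N_visible, DIM)


def patchify(img):
    """Split (B, 3, H, W) into flattened non-overlapping patches (B, N, 3*P*P)."""
    B, C, H, W = img.shape
    x = img.unfold(2, PATCH, PATCH).unfold(3, PATCH, PATCH)
    return x.permute(0, 2, 3, 1, 4, 5).reshape(B, -1, C * PATCH * PATCH)


def complementary_masks(num_patches, device):
    """Teacher hides MASK_RATIO of the patches; the student hides the complementary set."""
    perm = torch.randperm(num_patches, device=device)
    n_masked = int(num_patches * MASK_RATIO)
    return perm[n_masked:], perm[:n_masked]        # teacher-visible (25%), student-visible (75%)


class ComplementaryMaskModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = PatchEncoder()              # same structure and parameters for both branches
        self.decoder = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, DIM))
        self.pixel_head = nn.Linear(DIM, 3 * PATCH * PATCH)   # plays the role of the mask prediction module

    def forward(self, img):
        patches = patchify(img)                    # (B, N, 3*P*P)
        B, N, _ = patches.shape
        t_vis, s_vis = complementary_masks(N, img.device)

        # Teacher branch: encode its visible patches, decode with mask tokens,
        # and reconstruct the pixels of the teacher-hidden patches.
        t_feat = self.encoder(patches[:, t_vis])
        mask_tokens = self.mask_token.expand(B, N - t_vis.numel(), DIM)
        dec_out = self.decoder(torch.cat([t_feat, mask_tokens], dim=1))
        pred = self.pixel_head(dec_out[:, t_feat.shape[1]:])
        loss_mim = F.mse_loss(pred, patches[:, s_vis])        # teacher-hidden = student-visible

        # Student branch: encode the complementary patches and align with the teacher
        # in representation space via an InfoNCE-style contrastive loss (assumed temperature 0.1).
        s_glob = F.normalize(self.encoder(patches[:, s_vis]).mean(dim=1), dim=-1)
        t_glob = F.normalize(t_feat.mean(dim=1), dim=-1)
        logits = s_glob @ t_glob.t() / 0.1
        loss_con = F.cross_entropy(logits, torch.arange(B, device=img.device))

        return loss_mim + loss_con


if __name__ == "__main__":
    model = ComplementaryMaskModel()
    loss = model(torch.randn(4, 3, 224, 224))      # toy batch
    loss.backward()
    print(float(loss))
```

In this sketch both branches call the same encoder instance, reflecting the paper's statement that the teacher and student encoders share structure and parameters; how the two losses are weighted and whether a stop-gradient or momentum update is used are details the abstract does not specify.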
Pages: 199-206
Number of pages: 8