Boosting Relationship Detection in Images with Multi-Granular Self-Supervised Learning

Cited by: 1

Authors:

Ding, Xuewei [1]

Pan, Yingwei [2]

Li, Yehao [2]

Yao, Ting [2]

Zeng, Dan [1]

Mei, Tao [2]

Affiliations:

[1] Shanghai Univ, 99 ShangDa Rd, Shanghai 200444, Peoples R China

[2] AI Res, 8 Beichen West St, Beijing 100105, Peoples R China

Source:

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS | 2023, Vol. 19, No. 02

Funding:

National Key Research and Development Program of China;

Keywords:

Visual relationship detection; self-supervised learning;

DOI:

10.1145/3556978

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Visual and spatial relationship detection in images is a fast-developing research topic in the multimedia field. The task is to recognize the semantic and spatial interactions between objects in an image, with the aim of composing a structured semantic understanding of the scene. Most existing techniques directly encapsulate the holistic image feature plus the semantic and spatial features of the two given objects to predict the relationship, but leave the inherent supervision derived from such structured and thorough image understanding under-exploited. Specifically, the inherent supervision among objects or relations within an image can span different granularities in this hierarchy including, from simple to comprehensive: (1) the object-based supervision that captures the interaction between the semantic and spatial features of each individual object, (2) the inter-object supervision that characterizes the dependency within the relationship triplet (<subject-predicate-object>), and (3) the inter-relation supervision that exploits contextual information among all relationship triplets in an image. These inherent multi-granular supervisions offer fertile ground for building self-supervised proxy tasks. In this article, we compose a trilogy that explores the multi-granular supervision in sequence from the object-based, inter-object, and inter-relation perspectives. We integrate the standard relationship detection objective with a series of proposed self-supervised proxy tasks, and name the resulting approach Multi-Granular Self-Supervised learning (MGS). MGS is appealing in that it can be plugged into any neural relationship detection model by simply including the proxy tasks during training, without increasing the computational cost at inference. Through extensive experiments conducted on the SpatialSense and VRD datasets, we demonstrate the superiority of MGS for both spatial and visual relationship detection tasks.
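
The abstract says that the proxy tasks are added to the standard detection objective during training only. One common way to combine such objectives is a weighted sum, sketched below. This is an illustration under stated assumptions, not the authors' implementation: the function name, the task keys, and the unit weights are all hypothetical, and the abstract does not state how the losses are actually weighted or combined.

```python
# Hypothetical sketch: the names and weights below are assumptions for
# illustration and are not taken from the paper.
PROXY_WEIGHTS = {
    "object_based": 1.0,    # per-object semantic/spatial supervision
    "inter_object": 1.0,    # supervision within one <subject-predicate-object> triplet
    "inter_relation": 1.0,  # supervision across all triplets in an image
}


def training_loss(detection_loss, proxy_losses, weights=PROXY_WEIGHTS):
    """Return the detection loss plus a weighted sum of proxy-task losses."""
    unknown = set(proxy_losses) - set(weights)
    if unknown:
        raise KeyError(f"no weight for proxy tasks: {sorted(unknown)}")
    return detection_loss + sum(weights[k] * v for k, v in proxy_losses.items())


# Training step: all three proxy losses contribute to the total.
loss = training_loss(
    0.5,
    {"object_based": 0.25, "inter_object": 0.25, "inter_relation": 0.5},
)
```

Under this assumption the proxy terms exist only in the training loss, so a model trained this way would run the same detection forward pass at inference as one trained without them.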


Pages: 18
