Unbiased Scene Graph Generation Using Predicate Similarities

Citations: 0

Authors:

Matsui, Yusuke [1]

Ohashi, Misaki [1]

Affiliations:

[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Dept Informat & Commun Engn, Bunkyo Ku, Tokyo 1138656, Japan

Source:

IEEE Access | 2024 / Vol. 12

Keywords:

Task analysis; Knowledge transfer; Feature extraction; Visualization; Training; Computer vision; Transfer learning; Bioinformatics; Genomics; Classification algorithms; Scene classification; Scene graph; unbiased generation; predicate similarities; transfer learning; long-tailed distribution; SMOTE

DOI:

10.1109/ACCESS.2024.3424230

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Scene graphs are widely applied in computer vision as a graphical representation of relationships between objects shown in images. However, these applications have not yet reached a practical stage of development owing to biased training caused by long-tailed predicate distributions. Many recent studies have tackled this problem. In contrast, relatively few works have considered predicate similarities as a distinct dataset feature that also leads to biased prediction. Because of this feature, infrequent predicates (e.g., "parked on", "covered in") are easily misclassified as closely related frequent predicates (e.g., "on", "in"). Utilizing predicate similarities, we propose a new classification scheme that branches the process into several fine-grained classifiers for groups of similar predicates. These classifiers aim to capture the differences among similar predicates in detail. We also introduce transfer learning to enhance the features of predicates that lack sufficient training samples to learn descriptive representations. Our goal is to improve the average precision scores even for instances with tail predicates. The results of extensive experiments on the Visual Genome dataset show that combining our method with an existing debiasing approach greatly improves performance on tail predicates in the challenging SGCls/SGDet tasks. Nonetheless, the overall performance of the proposed approach does not reach that of the current state of the art, so further analysis remains as future work.


Pages: 95507-95516

Page count: 10
