Unbiased Scene Graph Generation Using Predicate Similarities

Cited by: 0

Authors:

Matsui, Yusuke [1]

Ohashi, Misaki [1]

Affiliation:

[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Dept Informat & Commun Engn, Bunkyo Ku, Tokyo 1138656, Japan

Source:

IEEE ACCESS | 2024 / Vol. 12

Keywords:

Task analysis; Knowledge transfer; Feature extraction; Visualization; Training; Computer vision; Transfer learning; Bioinformatics; Genomics; Classification algorithms; Scene classification; Scene graph; unbiased generation; predicate similarities; transfer learning; long-tailed distribution; SMOTE;

DOI:

10.1109/ACCESS.2024.3424230

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Scene graphs are widely applied in computer vision as a graphical representation of relationships between objects shown in images. However, these applications have not yet reached a practical stage of development owing to biased training caused by long-tailed predicate distributions. In recent years, many studies have tackled this problem. In contrast, relatively few works have considered predicate similarities as a unique dataset feature that also leads to biased prediction. Because of this feature, infrequent predicates (e.g., "parked on", "covered in") are easily misclassified as closely related frequent predicates (e.g., "on", "in"). Utilizing predicate similarities, we propose a new classification scheme that branches the process into several fine-grained classifiers for similar predicate groups. The classifiers aim to capture the differences among similar predicates in detail. We also introduce the idea of transfer learning to enhance the features for predicates that lack sufficient training samples to learn descriptive representations. Our target here is to improve the average precision scores even for instances with tail predicates. The results of extensive experiments on the Visual Genome dataset show that the combination of our method and an existing debiasing approach greatly improves performance on tail predicates in the challenging SGCls/SGDet tasks. Nonetheless, the overall performance of the proposed approach does not reach that of the current state of the art, so further analysis remains necessary as future work.
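The branching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the grouping table `GROUPS`, the function `classify`, and the score dictionaries are hypothetical names introduced only for illustration, and the scores are made-up numbers rather than model outputs. The sketch assumes a coarse classifier first selects a group of similar predicates, after which a group-specific fine-grained classifier chooses among the members of that group.

```python
from typing import Dict, List, Tuple

# Hypothetical grouping of similar predicates (an assumption for this sketch).
# Each key is a frequent "head" predicate; the list holds the predicates
# that are easily confused with it, including the head itself.
GROUPS: Dict[str, List[str]] = {
    "on": ["on", "parked on"],
    "in": ["in", "covered in"],
}


def argmax(scores: Dict[str, float]) -> str:
    """Return the key with the highest score."""
    return max(scores, key=lambda k: scores[k])


def classify(
    coarse_scores: Dict[str, float],
    fine_scores: Dict[str, Dict[str, float]],
) -> Tuple[str, str]:
    """Two-stage prediction: pick a predicate group, then a member of it.

    coarse_scores: score per group, as a coarse classifier might produce.
    fine_scores:   for each group, a score per member predicate, as that
                   group's fine-grained classifier might produce.
    """
    group = argmax(coarse_scores)
    # Restrict the second decision to the members of the selected group.
    member_scores = {p: fine_scores[group][p] for p in GROUPS[group]}
    return group, argmax(member_scores)


# Illustrative (made-up) scores for one subject-object pair.
coarse = {"on": 0.8, "in": 0.2}
fine = {
    "on": {"on": 0.4, "parked on": 0.6},
    "in": {"in": 0.7, "covered in": 0.3},
}
result = classify(coarse, fine)
print(result)
```

With these illustrative scores, the fine-grained stage can prefer the infrequent "parked on" over the frequent "on" because the second decision is made only among similar predicates, which is the kind of misclassification the abstract describes.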

Pages: 95507-95516

Page count: 10
