Unbiased Scene Graph Generation Using Predicate Similarities

Authors
Matsui, Yusuke [1]
Ohashi, Misaki [1]
Affiliation
[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Dept Informat & Commun Engn, Bunkyo Ku, Tokyo 1138656, Japan
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Task analysis; Knowledge transfer; Feature extraction; Visualization; Training; Computer vision; Transfer learning; Bioinformatics; Genomics; Classification algorithms; Scene classification; Scene graph; unbiased generation; predicate similarities; transfer learning; long-tailed distribution; SMOTE
DOI
10.1109/ACCESS.2024.3424230
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Scene graphs are widely used in computer vision as graphical representations of the relationships between objects in images. However, these applications have not yet reached a practical stage of development because long-tailed predicate distributions bias training. Many recent studies have tackled this problem, but relatively few have considered predicate similarity, a distinctive feature of the dataset that also leads to biased prediction. Because of this similarity, infrequent predicates (e.g., "parked on", "covered in") are easily misclassified as closely related frequent predicates (e.g., "on", "in"). Utilizing predicate similarities, we propose a new classification scheme that branches the process into several fine-grained classifiers for groups of similar predicates. These classifiers aim to capture the detailed differences among similar predicates. We also introduce transfer learning to enhance the features of predicates that lack enough training samples to learn descriptive representations. Our goal is to improve the average precision scores even for instances with tail predicates. Extensive experiments on the Visual Genome dataset show that combining our method with an existing debiasing approach greatly improves performance on tail predicates in the challenging SGCls and SGDet tasks. Nonetheless, the overall performance of the proposed approach does not reach that of the current state of the art, so further analysis remains as future work.
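The branching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the grouping in GROUPS, the function names, and all scores are hypothetical, chosen only to reuse the example predicates above. It assumes a coarse classifier that scores groups of similar predicates and one fine-grained classifier per group that scores the predicates inside it.

```python
from math import exp

# Hypothetical grouping: each frequent predicate with a similar tail predicate.
# The paper derives its groups from predicate similarities; this is illustrative.
GROUPS = {
    "on": ["on", "parked on"],
    "in": ["in", "covered in"],
}


def softmax(scores):
    """Turn a dict of raw scores into a dict of probabilities."""
    peak = max(scores.values())
    exps = {key: exp(value - peak) for key, value in scores.items()}
    total = sum(exps.values())
    return {key: value / total for key, value in exps.items()}


def classify(coarse_scores, fine_scores):
    """Pick a group with the coarse scores, then a predicate inside that group."""
    group_probs = softmax(coarse_scores)
    group = max(group_probs, key=group_probs.get)
    # Only the chosen group's fine-grained classifier is consulted.
    fine_probs = softmax({p: fine_scores[group][p] for p in GROUPS[group]})
    predicate = max(fine_probs, key=fine_probs.get)
    return group, predicate, group_probs[group] * fine_probs[predicate]


# Made-up scores for one object pair (e.g., a car and a street).
coarse_scores = {"on": 2.0, "in": 0.5}
fine_scores = {
    "on": {"on": 1.0, "parked on": 1.8},
    "in": {"in": 1.0, "covered in": 0.3},
}
group, predicate, prob = classify(coarse_scores, fine_scores)
print(group, predicate, round(prob, 3))
```

In this sketch the fine-grained stage only has to separate "parked on" from "on", rather than compete against every predicate at once, which is the intuition the abstract gives for branching.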
Pages: 95507-95516
Number of pages: 10