Unbiased Scene Graph Generation Using Predicate Similarities

Authors
Matsui, Yusuke [1]
Ohashi, Misaki [1]
Affiliations
[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Dept Informat & Commun Engn, Bunkyo Ku, Tokyo 1138656, Japan
Source
IEEE ACCESS | 2024 / Volume 12
Keywords
Task analysis; Knowledge transfer; Feature extraction; Visualization; Training; Computer vision; Transfer learning; Bioinformatics; Genomics; Classification algorithms; Scene classification; Scene graph; unbiased generation; predicate similarities; transfer learning; long-tailed distribution; SMOTE;
DOI
10.1109/ACCESS.2024.3424230
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Scene graphs are widely applied in computer vision as a graphical representation of the relationships between objects shown in images. However, these applications have not yet reached a practical stage of development owing to biased training caused by long-tailed predicate distributions. In recent years, many studies have tackled this problem. In contrast, relatively few works have considered predicate similarities as a unique dataset feature that also leads to biased predictions. Because of this feature, infrequent predicates (e.g., "parked on", "covered in") are easily misclassified as closely related frequent predicates (e.g., "on", "in"). Utilizing predicate similarities, we propose a new classification scheme that branches the process into several fine-grained classifiers for groups of similar predicates. The classifiers aim to capture the differences among similar predicates in detail. We also introduce the idea of transfer learning to enhance the features of predicates that lack sufficient training samples to learn descriptive representations. Our goal is to improve the average precision scores even for instances with tail predicates. The results of extensive experiments on the Visual Genome dataset show that combining our method with an existing debiasing approach greatly improves performance on tail predicates in the challenging SGCls/SGDet tasks. Nonetheless, the overall performance of the proposed approach does not reach that of the current state of the art, so further analysis remains necessary as future work.
Pages: 95507 - 95516
Number of pages: 10