RepSGG: Novel Representations of Entities and Relationships for Scene Graph Generation

Cited by: 3
Authors
Liu, Hengyue [1]
Bhanu, Bir [1]
Affiliations
[1] Univ Calif Riverside, Dept Elect & Comp Engn, Riverside, CA 92521 USA
Funding
U.S. National Science Foundation
Keywords
Feature extraction; Visualization; Semantics; Task analysis; Detectors; Shape; Training; Scene graph generation; visual relationship detection; long-tailed learning; human-object interaction
DOI
10.1109/TPAMI.2024.3402143
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Scene Graph Generation (SGG) has achieved significant progress recently. However, most previous works rely heavily on fixed-size entity representations based on bounding box proposals, anchors, or learnable queries. As each representation's cardinality has different trade-offs between performance and computation overhead, extracting highly representative features efficiently and dynamically is both challenging and crucial for SGG. In this work, a novel architecture called RepSGG is proposed to address the aforementioned challenges, formulating a subject as queries, an object as keys, and their relationship as the maximum attention weight between pairwise queries and keys. With more fine-grained and flexible representation power for entities and relationships, RepSGG learns to sample semantically discriminative and representative points for relationship inference. Moreover, the long-tailed distribution also poses a significant challenge for generalization of SGG. A run-time performance-guided logit adjustment (PGLA) strategy is proposed such that the relationship logits are modified via affine transformations based on run-time performance during training. This strategy encourages a more balanced performance between dominant and rare classes. Experimental results show that RepSGG achieves the state-of-the-art or comparable performance on the Visual Genome and Open Images V6 datasets with fast inference speed, demonstrating the efficacy and efficiency of the proposed methods.
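The abstract's core formulation (a subject as queries, an object as keys, and their relationship as the maximum attention weight between pairwise queries and keys) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the scaled dot-product logits, and the softmax taken over the keys of all candidate objects are assumptions made only for illustration, and the learned point sampling and predicate classification are omitted.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def relationship_scores(subject_queries, objects_keys):
    """Score one subject against each candidate object.

    subject_queries: list of query vectors for the subject entity.
    objects_keys: one list of key vectors per candidate object.
    Returns, per object, the maximum attention weight found between
    any subject query and any key of that object.

    ASSUMPTIONS (hypothetical, not taken from the paper): scaled
    dot-product logits, softmax normalised over all keys of all objects.
    """
    dim = len(subject_queries[0])
    # Flatten keys while remembering which object each key belongs to.
    flat_keys = [
        (obj_idx, key)
        for obj_idx, keys in enumerate(objects_keys)
        for key in keys
    ]
    scores = [0.0] * len(objects_keys)
    for query in subject_queries:
        logits = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for _, key in flat_keys
        ]
        weights = softmax(logits)
        for (obj_idx, _), weight in zip(flat_keys, weights):
            scores[obj_idx] = max(scores[obj_idx], weight)
    return scores


if __name__ == "__main__":
    queries = [[1.0, 0.0], [0.0, 1.0]]      # two points on the subject
    keys = [
        [[4.0, 0.0]],                       # object 0: one point
        [[0.0, 0.5], [0.5, 0.0]],           # object 1: two points
    ]
    print(relationship_scores(queries, keys))
```

In this toy input the first object has a key strongly aligned with one subject query, so it receives the higher score. Because entities are sets of vectors rather than single fixed-size features, the number of queries and keys per entity can vary.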
Pages: 8018-8035
Page count: 18
Related papers
50 in total
  • [41] Heterogeneous Learning for Scene Graph Generation
    He, Yunqing
    Ren, Tongwei
    Tang, Jinhui
    Wu, Gangshan
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 4704 - 4713
  • [42] Attribute Prototype-Guided Iterative Scene Graph for Explainable Radiology Report Generation
    Zhang, Ke
    Yang, Yan
    Yu, Jun
    Fan, Jianping
    Jiang, Hanliang
    Huang, Qingming
    Han, Weidong
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2024, 43 (12) : 4470 - 4482
  • [43] Image-Collection Summarization Using Scene-Graph Generation With External Knowledge
    Phueaksri, Itthisak
    Kastner, Marc A.
    Kawanishi, Yasutomo
    Komamizu, Takahiro
    Ide, Ichiro
    IEEE ACCESS, 2024, 12 : 17499 - 17512
  • [44] SRSG and S2SG: A Model and a Dataset for Scene Graph Generation of Remote Sensing Images From Segmentation Results
    Lin, Zhiyuan
    Zhu, Feng
    Kong, Yanzi
    Wang, Qun
    Wang, Jianyu
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [45] Atom correlation based graph propagation for scene graph generation
    Lin, Bingqian
    Zhu, Yi
    Liang, Xiaodan
    PATTERN RECOGNITION, 2022, 122
  • [46] Dynamic Gated Graph Neural Networks for Scene Graph Generation
    Khademi, Mahmoud
    Schulte, Oliver
    COMPUTER VISION - ACCV 2018, PT VI, 2019, 11366 : 669 - 685
  • [47] DBiased-P: Dual-Biased Predicate Predictor for Unbiased Scene Graph Generation
    Han, Xianjing
    Song, Xuemeng
    Dong, Xingning
    Wei, Yinwei
    Liu, Meng
    Nie, Liqiang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 5319 - 5329
  • [48] Uncertainty-Aware Scene Graph Generation
    Li, Xuewei
    Wu, Tao
    Zheng, Guangcong
    Yu, Yunlong
    Li, Xi
    PATTERN RECOGNITION LETTERS, 2023, 167 : 30 - 37
  • [49] One-shot Scene Graph Generation
    Guo, Yuyu
    Song, Jingkuan
    Gao, Lianli
    Shen, Heng Tao
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 3090 - 3098
  • [50] A Novel Approach to Scene Graph Vectorization
    Kumar, Vinod
    Aggarwal, Deepanshu
    Bathwal, Vinamra
    Singh, Saurabh
    2021 IEEE INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION, AND INTELLIGENT SYSTEMS (ICCCIS), 2021, : 696 - 701