Multi-Granularity Sparse Relationship Matrix Prediction Network for End-to-End Scene Graph Generation

Cited by: 0
Authors
Wang, Lei [1]
Yuan, Zejian [1]
Chen, Badong [1]
Affiliations
[1] Xi'an Jiaotong University, Institute of Artificial Intelligence and Robotics, Xi'an 710049, People's Republic of China
Source
COMPUTER VISION-ECCV 2024, PT LXXXII | 2025 / Vol. 15140
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
Scene Graph Generation; End-to-End; Sparse Relationship Matrix; Multi-Granularity
DOI
10.1007/978-3-031-73007-8_7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Current end-to-end Scene Graph Generation (SGG) methods rely solely on visual representations to separately detect sparse relations and entities in an image. As a result, entity predictions do not contribute to relation prediction, and post-processing is needed to assign the corresponding subjects and objects to the predicted relations. In this paper, we introduce a sparse relationship matrix that bridges entity detection and relation detection. Our approach not only eliminates the need for relation matching, but also leverages the semantic and positional information of predicted entities to enhance relation prediction. Specifically, a multi-granularity sparse relationship matrix prediction network is proposed, which uses three gated pooling modules, each filtering negative samples at a different granularity, to obtain a sparse relationship matrix containing the entity pairs most likely to form relations. Finally, a sparse set of the most probable subject-object pairs is constructed and used for relation decoding. Experimental results on multiple datasets demonstrate that our method achieves new state-of-the-art overall performance. Our code is available at https://github.com/wanglei0618/Mg-RMPN.
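The idea of keeping only the most probable subject-object pairs from a relationship matrix can be illustrated with a minimal sketch. This is not the authors' network: the gated pooling modules are replaced here by a plain top-k selection over hand-written scores, and the function name `top_k_pairs`, the score values, and the choice of `k` are all hypothetical, introduced only for illustration.

```python
from typing import List, Tuple


def top_k_pairs(scores: List[List[float]], k: int) -> List[Tuple[int, int, float]]:
    """Return the k highest-scoring (subject, object, score) entries.

    `scores[i][j]` is an assumed likelihood that entity i (subject) and
    entity j (object) form a relation. Diagonal entries are skipped,
    since an entity is not paired with itself.
    """
    n = len(scores)
    candidates = [
        (i, j, scores[i][j])
        for i in range(n)
        for j in range(n)
        if i != j
    ]
    # Sort by score, highest first, and keep the sparse top-k subset.
    candidates.sort(key=lambda t: t[2], reverse=True)
    return candidates[:k]


# Toy pairwise scores for three detected entities (hypothetical values).
scores = [
    [0.0, 0.9, 0.1],
    [0.2, 0.0, 0.7],
    [0.05, 0.3, 0.0],
]
pairs = top_k_pairs(scores, k=2)
print(pairs)
```

In this sketch, only the selected pairs would be passed on to relation decoding, so the subject and object of each predicted relation are known by construction and no separate matching step is required.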
Pages: 105-121
Page count: 17