SGTR+: End-to-End Scene Graph Generation With Transformer

Cited by: 1
Authors
Li, Rongjie [1]
Zhang, Songyang [1]
He, Xuming [1, 2]
Affiliations
[1] ShanghaiTech Univ, Shanghai 201210, Peoples R China
[2] Shanghai Engn Res Ctr Intelligent Vis & Imaging, Shanghai 201210, Peoples R China
Keywords
Computer vision; deep learning; scene graph generation; scene understanding; visual relationship detection;
DOI
10.1109/TPAMI.2023.3332246
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Scene Graph Generation (SGG) remains a challenging visual understanding task due to its compositional property. Most previous works adopt either a bottom-up, two-stage approach or a point-based, one-stage approach, which often suffer from high time complexity or suboptimal designs. In this paper, we propose a novel SGG method that addresses these issues by formulating the task as a bipartite graph construction problem. Specifically, we create a transformer-based end-to-end framework to generate the entity and entity-aware predicate proposal sets, and infer directed edges to form relation triplets. Moreover, we design a graph assembling module to infer the connectivity of the bipartite scene graph based on our entity-aware structure, enabling us to generate the scene graph in an end-to-end manner. Building on the bipartite graph assembling paradigm, we further propose a new technical design to improve the efficacy of entity-aware modeling and the optimization stability of graph assembling. Equipped with the enhanced entity-aware design, our method achieves optimal performance and time complexity. Extensive experimental results show that our design achieves state-of-the-art or comparable performance on three challenging benchmarks, surpassing most existing approaches while offering more efficient inference.
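To make the bipartite graph assembling idea in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the names `Entity`, `Predicate`, `match_score`, and `assemble` are hypothetical, and the matching rule (label equality gated by box IoU, top-1 selection) is an assumed stand-in for whatever learned similarity the paper uses. It only illustrates how entity-aware predicate proposals could be linked to entity proposals to form relation triplets.

```python
from typing import List, NamedTuple, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


class Entity(NamedTuple):
    label: str
    box: Box


class Predicate(NamedTuple):
    label: str
    subj: Entity  # entity indicator predicted alongside the predicate
    obj: Entity   # entity indicator predicted alongside the predicate


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_score(indicator: Entity, entity: Entity) -> float:
    # Assumed similarity: boxes must overlap and labels must agree.
    return iou(indicator.box, entity.box) if indicator.label == entity.label else 0.0


def best_match(indicator: Entity, entities: List[Entity]) -> int:
    """Index of the entity proposal that best matches an indicator."""
    scores = [match_score(indicator, e) for e in entities]
    return max(range(len(entities)), key=scores.__getitem__)


def assemble(entities: List[Entity], predicates: List[Predicate]):
    """Link each predicate node to a subject and an object entity node,
    giving the directed edges of the bipartite graph as triplets."""
    return [
        (best_match(p.subj, entities), p.label, best_match(p.obj, entities))
        for p in predicates
    ]


entities = [
    Entity("person", (0, 0, 10, 20)),
    Entity("horse", (5, 10, 30, 30)),
    Entity("person", (40, 0, 50, 20)),
]
predicates = [
    Predicate(
        "riding",
        Entity("person", (1, 0, 10, 19)),
        Entity("horse", (6, 11, 29, 30)),
    )
]
triplets = assemble(entities, predicates)
print(triplets)  # [(0, 'riding', 1)]
```

In this toy example the "riding" predicate is attached to the first person and the horse because its subject and object indicators overlap those entity boxes most strongly. A real system would likely use learned, differentiable similarities and keep the top-K matches rather than a single hard assignment.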
Pages: 2191 - 2205
Page count: 15
Related papers
50 in total
  • [1] A Novel End-to-End Transformer for Scene Graph Generation
    Ren, Chengkai
    Liu, Xiuhua
    Cao, Mengyuan
    Zhang, Jian
    Wang, Hongwei
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [2] Learning Scene-Pedestrian Graph for End-to-End Person Search
    Song, Zifan
    Zhao, Cairong
    Hu, Guosheng
    Miao, Duoqian
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2024, 20 (02) : 2979 - 2990
  • [3] Multi-Granularity Sparse Relationship Matrix Prediction Network for End-to-End Scene Graph Generation
    Wang, Lei
    Yuan, Zejian
    Chen, Badong
    COMPUTER VISION-ECCV 2024, PT LXXXII, 2025, 15140 : 105 - 121
  • [4] TransVG++: End-to-End Visual Grounding With Language Conditioned Vision Transformer
    Deng, Jiajun
    Yang, Zhengyuan
    Liu, Daqing
    Chen, Tianlang
    Zhou, Wengang
    Zhang, Yanyong
    Li, Houqiang
    Ouyang, Wanli
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (11) : 13636 - 13652
  • [5] RelTR: Relation Transformer for Scene Graph Generation
    Cong, Yuren
    Yang, Michael Ying
    Rosenhahn, Bodo
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (09) : 11169 - 11183
  • [6] Transformer networks with adaptive inference for scene graph generation
    Wang, Yini
    Gao, Yongbin
    Yu, Wenjun
    Guo, Ruyan
    Wan, Weibing
    Yang, Shuqun
    Huang, Bo
    APPLIED INTELLIGENCE, 2023, 53 (08) : 9621 - 9633
  • [7] Transformer networks with adaptive inference for scene graph generation
    Yini Wang
    Yongbin Gao
    Wenjun Yu
    Ruyan Guo
    Weibing Wan
    Shuqun Yang
    Bo Huang
    Applied Intelligence, 2023, 53 : 9621 - 9633
  • [8] V-DETR: Pure Transformer for End-to-End Object Detection
    Dung Nguyen
    Van-Dung Hoang
    Van-Tuong-Lan Le
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS, PT II, ACIIDS 2024, 2024, 14796 : 120 - 131
  • [9] SRDD: a lightweight end-to-end object detection with transformer
    Zhu, Yuan
    Xia, Qingyuan
    Jin, Wen
    CONNECTION SCIENCE, 2022, 34 (01) : 2448 - 2465
  • [10] SDformer: Efficient End-to-End Transformer for Depth Completion
    Qian, Jian
    Sun, Miao
    Lee, Ashley
    Li, Jie
    Zhuo, Shenglong
    Chiang, Patrick Yin
    2022 INTERNATIONAL CONFERENCE ON INDUSTRIAL AUTOMATION, ROBOTICS AND CONTROL ENGINEERING, IARCE, 2022, : 56 - 61