Predicate Correlation Learning for Scene Graph Generation

Cited by: 16
Authors
Tao, Leitian [1]
Mi, Li [1]
Li, Nannan [1]
Cheng, Xianhang [1]
Hu, Yaosi [1]
Chen, Zhenzhong [1]
Affiliations
[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image understanding; scene graph generation; long-tailed bias; predicate correlation; semantic overlap; IMBALANCED DATA; SMOTE;
DOI
10.1109/TIP.2022.3181511
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
For a typical Scene Graph Generation (SGG) method in image understanding, there usually exists a large gap in the performance of the predicates' head classes and tail classes. This phenomenon is mainly caused by the semantic overlap between different predicates as well as the long-tailed data distribution. In this paper, a Predicate Correlation Learning (PCL) method for SGG is proposed to address the above problems by taking the correlation between predicates into consideration. To measure the semantic overlap between highly correlated predicate classes, a Predicate Correlation Matrix (PCM) is defined to quantify the relationship between predicate pairs, which is dynamically updated to remove the matrix's long-tailed bias. In addition, PCM is integrated into a predicate correlation loss function (L-PC) to reduce discouraging gradients of unannotated classes. The proposed method is evaluated on several benchmarks, where the performance of the tail classes is significantly improved when built on existing methods.
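The two ideas named in the abstract, a matrix that quantifies how strongly predicate pairs overlap and a loss that softens the penalty on unannotated but correlated classes, can be illustrated with a minimal sketch. This is not the paper's formulation: the function names, the way the matrix is estimated (averaging predicted distributions per annotated class), and the weighting rule `1 - corr[label][j]` are all assumptions made for illustration only.

```python
import math


def build_correlation_matrix(probs, labels, num_classes):
    """Hypothetical correlation estimate: row i is the mean predicted
    distribution over all samples annotated with predicate i."""
    sums = [[0.0] * num_classes for _ in range(num_classes)]
    counts = [0] * num_classes
    for p, y in zip(probs, labels):
        counts[y] += 1
        for j in range(num_classes):
            sums[y][j] += p[j]
    corr = []
    for i in range(num_classes):
        if counts[i] == 0:
            # No samples for this class: fall back to an identity row.
            corr.append([1.0 if j == i else 0.0 for j in range(num_classes)])
        else:
            corr.append([s / counts[i] for s in sums[i]])
    return corr


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def correlation_weighted_loss(logits, label, corr):
    """Per-class binary cross-entropy in which each unannotated class j is
    penalised with weight 1 - corr[label][j] (an assumed rule), so classes
    that overlap with the annotated one receive a weaker negative gradient."""
    loss = 0.0
    for j, z in enumerate(logits):
        p = sigmoid(z)
        if j == label:
            loss += -math.log(p)
        else:
            weight = 1.0 - corr[label][j]
            loss += -weight * math.log(1.0 - p)
    return loss
```

With a row such as `[0.7, 0.3, 0.0]` for the annotated class, the second class is penalised at 70% of the usual strength while the third, uncorrelated class is penalised in full.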
Pages: 4173-4185
Page count: 13