Visual relationship detection based on bidirectional recurrent neural network

Cited by: 6
Authors
Dai, Yibo [1]
Wang, Chao [1]
Dong, Jian [1]
Sun, Changyin [1]
Affiliations
[1] Southeast Univ, Sch Automat, Key Lab Measurement & Control Complex Syst Engn, Nanjing 210096, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Detection; RNN; Visual relationship; NMS;
DOI
10.1007/s11042-019-7732-z
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Visual relationship detection aims to mine the interactions between paired objects in an image and to describe the image in the form (subject - predicate - object). Most previous works treat it as a pure classification problem, taking each whole triplet as a label of the image; however, the large number of object combinations and the diversity of predicates pose tough challenges for such approaches. Hence, we propose a deep model based on a modified bidirectional recurrent neural network (BRNN) that classifies objects and predicts predicates simultaneously. The BRNN extracts the hidden relationship information in the image, and a feature-infusion method is proposed. Additionally, we improve on existing works by introducing a paired non-maximum suppression (NMS) method. Experiments show that our approach is competitive with state-of-the-art works.
Pages: 35297-35313
Number of pages: 17