End-to-End Single Shot Detector Using Graph-Based Learnable Duplicate Removal

Cited by: 1
Authors
Ding, Shuxiao [1, 2]
Rehder, Eike [1]
Schneider, Lukas [1]
Cordts, Marius [1]
Gall, Juergen [2]
Affiliations
[1] Mercedes Benz AG, Stuttgart, Germany
[2] Univ Bonn, Bonn, Germany
Source
PATTERN RECOGNITION, DAGM GCPR 2022 | 2022 / Vol. 13485
Keywords
End-to-end detection; learning duplicate removal; relationship modeling
DOI
10.1007/978-3-031-16788-1_23
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Non-Maximum Suppression (NMS) is widely used to remove duplicates in object detection. In strong disagreement with the deep learning paradigm, NMS often remains as the only heuristic step. Learning NMS methods have been proposed that are either designed for Faster-RCNN or rely on separate networks. In contrast, learning NMS for SSD models is not well investigated. In this paper, we show that even a very simple rescoring network can be trained end-to-end with an underlying SSD model to solve the duplicate removal problem efficiently. For this, detection scores and boxes are refined from image features by modeling relations between detections in a Graph Neural Network (GNN). Our approach is applicable to the large number of object proposals in SSD using a pre-filtering head. It can easily be employed in arbitrary SSD-like models with weight-shared box predictor. Experiments on MS-COCO and KITTI show that our method improves accuracy compared with other duplicate removal methods at significantly lower inference time.
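For orientation, the heuristic step that the abstract argues against can be sketched as follows. This is a minimal pure-Python sketch of classical greedy NMS, not the paper's GNN-based rescoring network; the function names `iou` and `greedy_nms`, the `(x1, y1, x2, y2)` box layout, and the default threshold of 0.5 are assumptions chosen for illustration.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return indices of kept detections, highest score first.

    A detection is kept only if its IoU with every already-kept
    detection does not exceed iou_threshold (a hand-tuned value).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)  # the second box overlaps the first and is dropped
```

The fixed threshold and the hard keep-or-drop decision are the hand-designed parts; the abstract describes replacing them with rescoring that is trained jointly with the detector.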
Pages: 375-389
Page count: 15