LLA: Loss-aware label assignment for dense pedestrian detection

Cited by: 40
Authors
Ge, Zheng [1 ,4 ]
Wang, Jianfeng [4 ]
Huang, Xin [1 ]
Liu, Songtao [4 ]
Yoshie, Osamu [2 ,3 ]
Affiliations
[1] Waseda Univ, Grad Sch Informat Prod & Syst, Tokyo, Japan
[2] Waseda Univ, Tokyo, Japan
[3] Waseda Univ, Acad Fus, Inst Global Strategies Ind, IGSIAF, Tokyo, Japan
[4] Megvii Technol, Beijing, Peoples R China
Keywords
Pedestrian detection; Label assignment; Occlusion aware;
DOI
10.1016/j.neucom.2021.07.094
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Label assignment has been widely studied in general object detection because of its great impact on detectors' performance. In the field of dense pedestrian detection, human bodies are often heavily entangled, making label assignment even more important. However, none of the existing label assignment methods focuses on crowd scenarios. Motivated by this, we propose Loss-aware Label Assignment (LLA) to boost the performance of pedestrian detectors in crowd scenarios. Concretely, LLA first calculates classification (cls) and regression (reg) losses between each anchor and ground-truth (GT) pair. A joint loss is then defined as the weighted sum of the cls and reg losses and serves as the assignment indicator. Finally, the anchors with the top K minimum joint losses for a certain GT box are assigned as its positive anchors; anchors not assigned to any GT box are considered negative. LLA is simple but effective. Experiments on CrowdHuman and CityPersons show that this simple label assignment strategy improves MR by 9.53% and 5.47% on two well-known one-stage detectors, RetinaNet and FCOS, making them the first one-stage detectors to surpass Faster R-CNN in crowd scenarios. (c) 2021 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
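The assignment procedure described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the weight `lam`, the value of `k`, and the tie-breaking rule (here, later GTs simply overwrite earlier ones) are all assumptions made for the sketch.

```python
import numpy as np

def lla_assign(cls_loss, reg_loss, lam=1.0, k=9):
    """Loss-aware Label Assignment (sketch).

    cls_loss, reg_loss: (num_gt, num_anchors) loss matrices, one row per
    ground-truth box, one column per anchor.
    Returns a (num_anchors,) array of assigned GT indices, -1 = negative.
    """
    # Joint loss as a weighted sum of classification and regression losses.
    joint = cls_loss + lam * reg_loss
    num_gt, num_anchors = joint.shape
    assign = np.full(num_anchors, -1, dtype=int)
    for g in range(num_gt):
        # The k anchors with minimum joint loss for this GT become positives.
        topk = np.argsort(joint[g])[:k]
        assign[topk] = g
    return assign
```

In a real detector the losses would be recomputed each training iteration, so the assignment adapts as the network learns; anchors claimed by several GT boxes would also need a principled conflict resolution (e.g. keeping the lower-cost pair), which this sketch omits.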
Pages: 272-281
Number of pages: 10
References
43 records
[1]   Prime Sample Attention in Object Detection [J].
Cao, Yuhang ;
Chen, Kai ;
Loy, Chen Change ;
Lin, Dahua .
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020, :11580-11588
[2]  
Carion N., 2020, End-to-End Object Detection with Transformers
[3]   Beyond triplet loss: a deep quadruplet network for person re-identification [J].
Chen, Weihua ;
Chen, Xiaotang ;
Zhang, Jianguo ;
Huang, Kaiqi .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :1320-1329
[4]   Detection in Crowded Scenes: One Proposal, Multiple Predictions [J].
Chu, Xuangeng ;
Zheng, Anlin ;
Zhang, Xiangyu ;
Sun, Jian .
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020, :12211-12220
[5]   Pedestrian Detection: An Evaluation of the State of the Art [J].
Dollar, Piotr ;
Wojek, Christian ;
Schiele, Bernt ;
Perona, Pietro .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2012, 34 (04) :743-761
[6]   Design and Simulation Evaluation of Pneumatic Tool Handle [J].
Ge Zhenghao ;
Wei Siyuan ;
Wei Tao .
2020 IEEE 3RD INTERNATIONAL CONFERENCE ON MECHATRONICS, ROBOTICS AND AUTOMATION (ICMRA 2020), 2020, :1-5
[7]   Fast R-CNN [J].
Girshick, Ross .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :1440-1448
[8]   Deep Residual Learning for Image Recognition [J].
He, Kaiming ;
Zhang, Xiangyu ;
Ren, Shaoqing ;
Sun, Jian .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :770-778
[9]   NMS by Representative Region: Towards Crowded Pedestrian Detection by Proposal Pairing [J].
Huang, Xin ;
Ge, Zheng ;
Jie, Zequn ;
Yoshie, Osamu .
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020, :10747-10756
[10]  
Kim K., ARXIV PREPRINT ARXIV