MULTI-SCALE DEFORMABLE TRANSFORMER ENCODER BASED SINGLE-STAGE PEDESTRIAN DETECTION

Cited by: 1
Authors
Yuan, Jing [1]
Barmpoutis, Panagiotis [2]
Stathaki, Tania [1]
Affiliations
[1] Imperial Coll London, Dept Elect & Elect Engn, London, England
[2] UCL, Dept Comp Sci, London, England
Source
2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP | 2022
Keywords
Pedestrian detection; single-stage method; vision transformer; NETWORK
DOI
10.1109/ICIP46576.2022.9897361
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Pedestrian detection is a key task in intelligent video surveillance systems that requires both fast inference and high detection accuracy. Although single-stage deep learning pedestrian detectors achieve relatively high accuracy with a simpler architecture and shorter inference time, they still trail two-stage methods in performance. The reason is that, without the assistance of proposal regions, they lack scale-aware features. To overcome this, a multi-scale deformable transformer encoder-based module is proposed. It extracts sparse, important features at deformable sampling locations from multiple feature levels. The proposed architecture significantly improves performance over the baseline center and scale prediction method on both the Caltech and CityPersons datasets. It even outperforms state-of-the-art two-stage methods in detecting heavily occluded pedestrians on the CityPersons validation set.
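The sampling idea mentioned in the abstract (sparse features read at deformable locations across several feature levels) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the scalar-valued feature maps, the normalized coordinate convention, and the single softmax over all sampling points are assumptions chosen for brevity. In a real encoder the offsets and scores would be predicted from the query feature by learned layers, which are omitted here.

```python
import math


def bilinear_sample(feature_map, x, y):
    """Bilinearly sample a 2D grid (list of rows) at normalized (x, y) in [0, 1]."""
    height, width = len(feature_map), len(feature_map[0])
    # Clamp to the valid range, then map to pixel coordinates.
    px = min(max(x, 0.0), 1.0) * (width - 1)
    py = min(max(y, 0.0), 1.0) * (height - 1)
    x0, y0 = int(math.floor(px)), int(math.floor(py))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    wx, wy = px - x0, py - y0
    top = (1 - wx) * feature_map[y0][x0] + wx * feature_map[y0][x1]
    bottom = (1 - wx) * feature_map[y1][x0] + wx * feature_map[y1][x1]
    return (1 - wy) * top + wy * bottom


def softmax(scores):
    """Numerically stable softmax over a flat list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multiscale_deformable_sample(levels, reference, offsets, scores):
    """Aggregate a few sampled values around one reference point.

    levels:    list of 2D feature maps, one per scale (hypothetical scalar features)
    reference: (x, y) reference point in normalized coordinates
    offsets:   offsets[l] is a list of (dx, dy) sampling offsets for level l
    scores:    scores[l] is a list of unnormalized weights, one per offset
    """
    # Normalize the weights jointly over every sampling point on every level.
    weights = softmax([s for level_scores in scores for s in level_scores])
    output, k = 0.0, 0
    for level_map, level_offsets in zip(levels, offsets):
        for dx, dy in level_offsets:
            sampled = bilinear_sample(level_map, reference[0] + dx, reference[1] + dy)
            output += weights[k] * sampled
            k += 1
    return output
```

Only a handful of locations per level are read for each query, which is what keeps this kind of attention sparse compared with attending to every position on every feature map.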
Pages: 2906-2910
Number of pages: 5