Swin Transformer for Pedestrian and Occluded Pedestrian Detection

Cited by: 1
Authors
Liang, Jung-An [1]
Ding, Jian-Jiun [1]
Affiliations
[1] Natl Taiwan Univ, Grad Inst Commun Engn, Taipei, Taiwan
Source
2024 IEEE International Symposium on Circuits and Systems (ISCAS 2024) | 2024
Keywords
transformer; deep learning; computer vision; object detection; autonomous driving system
DOI
10.1109/ISCAS58744.2024.10558302
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Pedestrian recognition is crucial for computer vision and self-driving system design. In this work, the Swin Transformer (SwinT), which can capture global contextual information and handle long-range dependencies, is adopted to perform pedestrian detection in complex scenes, including heavy-occlusion scenarios. The SwinT is capable of capturing multi-scale features and spatial relationships in images, making it well suited to the challenging task of occluded pedestrian detection. We also apply a two-stage detector based on the Faster R-CNN framework, which consists of a cascade region proposal network (RPN) and a region of interest (ROI) head, and use anchors and the focal loss during RPN training. Experiments conducted on the EuroCity Persons and CityPersons datasets demonstrate the strong performance of the proposed architecture in detecting heavily occluded pedestrians, highlighting its ability to handle challenging scenarios that traditional methods may struggle with.
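The abstract states that the focal loss is used during RPN training but gives no formula or hyperparameters. As a rough illustration only, the following is a minimal sketch of the standard binary focal loss for a single anchor. The function name `binary_focal_loss` is hypothetical, and the defaults `alpha=0.25` and `gamma=2.0` are the commonly used values from the original focal loss formulation, not values confirmed by this record.

```python
import math


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Focal loss for one anchor (illustrative sketch, not the authors' code).

    p     -- predicted foreground probability in [0, 1]
    y     -- ground-truth label, 1 for pedestrian and 0 for background
    alpha -- class-balancing weight (assumed default)
    gamma -- focusing parameter (assumed default)
    """
    # Probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    # Class-balancing weight for the true class.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0)
    # The (1 - p_t)^gamma factor down-weights well-classified anchors.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    easy = binary_focal_loss(0.9, 1)   # confident, correct prediction
    hard = binary_focal_loss(0.1, 1)   # confident, wrong prediction
    print(f"easy anchor loss: {easy:.6f}")
    print(f"hard anchor loss: {hard:.6f}")
```

With `gamma = 0` the expression reduces to an alpha-weighted cross-entropy. Larger `gamma` shifts the training signal toward hard anchors, which is the usual motivation for using this loss when most anchors are easy background.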
Pages: 5