Swin Transformer for Pedestrian and Occluded Pedestrian Detection

Times cited: 1
Authors
Liang, Jung-An [1]
Ding, Jian-Jiun [1]
Affiliation
[1] Natl Taiwan Univ, Grad Inst Commun Engn, Taipei, Taiwan
Source
2024 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS 2024 | 2024
Keywords
transformer; deep learning; computer vision; object detection; autonomous driving system;
DOI
10.1109/ISCAS58744.2024.10558302
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Pedestrian recognition is crucial for computer vision and self-driving system design. In this work, the Swin Transformer (SwinT), which can capture global contextual information and handle long-range dependencies, is adopted to perform pedestrian detection in complex scenes, including heavily occluded scenarios. The SwinT is capable of capturing multi-scale features and spatial relationships in images, making it well suited to the challenging task of occluded pedestrian detection. We also apply a two-stage detector based on the Faster R-CNN framework, which consists of a cascade region proposal network (RPN) and a region of interest (ROI) head, and use anchors and the focal loss during the RPN training process. The experiments conducted on the EuroCity Persons and CityPersons datasets demonstrate the strong performance of the proposed architecture in detecting heavily occluded pedestrians, highlighting its ability to handle challenging scenarios that traditional methods may struggle with.
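The abstract names the focal loss as part of RPN training but gives no formula. As a rough illustration only, the sketch below implements the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), for a per-anchor foreground/background score. The function names and the defaults alpha = 0.25 and gamma = 2.0 are assumptions taken from common practice, not values reported in this record.

```python
import math


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Focal loss for a single anchor (hypothetical helper, not the authors' code).

    p: predicted probability that the anchor contains a pedestrian (0..1)
    y: ground-truth label, 1 for foreground and 0 for background
    alpha, gamma: assumed defaults, not taken from the paper
    """
    # Probability assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    # Class-balancing weight for the true class.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0)
    # (1 - p_t)^gamma down-weights anchors that are already classified well.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, alpha=0.25, gamma=2.0):
    """Average focal loss over a batch of anchors."""
    losses = [binary_focal_loss(p, y, alpha, gamma) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # A confident correct prediction contributes far less than a hard one.
    print(binary_focal_loss(0.9, 1))
    print(binary_focal_loss(0.1, 1))
    print(mean_focal_loss([0.9, 0.1, 0.2], [1, 1, 0]))
```

With gamma set to 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is one way to sanity-check an implementation. Because most anchors in an RPN are easy background, this down-weighting keeps them from dominating the gradient.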
Pages: 5