Swin transformer based vehicle detection in undisciplined traffic environment

Cited by: 33
Authors
Deshmukh, Prashant [1]
Satyanarayana, G. S. R. [1, 2]
Majhi, Sudhan [3]
Sahoo, Upendra Kumar [1]
Das, Santos Kumar [1]
Affiliations
[1] Natl Inst Technol Rourkela, Dept Elect & Commun Engn, Rourkela, India
[2] Vignans Fdn Sci Technol & Res, Dept Elect & Commun Engn, Guntur, India
[3] Indian Inst Sci, Dept Elect Commun Engn, Bangalore, India
Keywords
Deep learning; Undisciplined traffic environment; Visual transformer; Vehicle detection; DETECTION SYSTEM; OBJECT DETECTION; CLASSIFICATION
DOI
10.1016/j.eswa.2022.118992
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Intelligent vehicle detection (IVD) plays a prominent role in developing an intelligent traffic management system (ITMS). It can help reduce the average waiting time at traffic posts, lower fuel consumption, control traffic congestion, decrease accident rates, and improve human safety. Recent developments in artificial intelligence (AI) have increased the demand for IVD in undisciplined traffic environments, which are common in developing countries. IVD is difficult in such environments because vehicles of different categories travel very close to each other on the road and do not follow traffic rules. Several convolutional neural network (CNN) based deep learning (DL) techniques and visual transformer-based techniques for vehicle and object detection have been presented previously. They are complex, and because they rely on existing CNN feature extraction backbones, they do not extract multi-scale features accurately. Moreover, most of them do not account for undisciplined traffic environments because labeled vehicle datasets are unavailable. This paper therefore proposes a swin transformer-based vehicle detection (STVD) framework for undisciplined traffic environments. The swin transformer (ST) exchanges information both within and between image patches and produces hierarchical feature maps, which effectively alleviates the multi-scale feature extraction problem. A bi-directional feature pyramid network (BIFPN) combines low-resolution features with high-resolution features in both directions and provides robust multi-scale features at different scales and resolutions. A fully connected vehicle detection head (FCVDH) is applied to improve the matching between vehicle sizes and the BIFPN hierarchy; it predicts the locations and categories of vehicles in the input image. STVD is analyzed and evaluated on realistic traffic data and compared with existing state-of-the-art vehicle detection methods. It achieves 91.32% detection accuracy on the diverse traffic labeled dataset (DTLD), 87.4% on IITM-hetra, and 88.45% on KITTI.
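The bidirectional fusion attributed to the BIFPN above can be illustrated with a small sketch. It is not the authors' implementation: it assumes the "fast normalized fusion" rule commonly used in BiFPN-style networks, works on 1-D lists instead of image tensors, uses fixed fusion weights where a real network would learn them, and all function names (`fuse`, `upsample`, `downsample`, `bifpn_layer`) are hypothetical.

```python
def fuse(features, weights, eps=1e-4):
    """Fast normalized fusion: weighted average of equal-length feature vectors."""
    w = [max(0.0, x) for x in weights]  # clamp weights to be non-negative
    total = sum(w) + eps                # eps avoids division by zero
    return [sum(wi * f[i] for wi, f in zip(w, features)) / total
            for i in range(len(features[0]))]


def upsample(x):
    """Nearest-neighbour upsampling by a factor of 2."""
    return [v for v in x for _ in range(2)]


def downsample(x):
    """Max pooling with window 2 and stride 2 (assumes even length)."""
    return [max(x[i], x[i + 1]) for i in range(0, len(x), 2)]


def bifpn_layer(levels):
    """One bidirectional fusion pass.

    levels: 1-D feature maps ordered from highest to lowest resolution,
    each half the length of the previous one (an assumption of this sketch).
    """
    n = len(levels)

    # Top-down pass: push low-resolution context into higher resolutions.
    td = list(levels)
    for i in range(n - 2, -1, -1):
        td[i] = fuse([levels[i], upsample(td[i + 1])], [1.0, 1.0])

    # Bottom-up pass: push high-resolution detail back to lower resolutions.
    out = list(td)
    for i in range(1, n):
        inputs = [levels[i], downsample(out[i - 1])]
        if i < n - 1:
            inputs.insert(1, td[i])  # intermediate levels also reuse the top-down node
        out[i] = fuse(inputs, [1.0] * len(inputs))
    return out


if __name__ == "__main__":
    pyramid = [[1.0] * 8, [2.0] * 4, [3.0] * 2]
    for level in bifpn_layer(pyramid):
        print(len(level), [round(v, 3) for v in level])
```

Each output level keeps its original resolution but mixes information from its neighbours in both directions, which is the property the abstract relies on for multi-scale vehicle sizes.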
Pages: 13