AE-Net: A High Accuracy and Efficient Network for Railway Obstacle Detection Based on Convolution and Transformer

Cited: 3
Authors
Zhao, Zongyang [1 ]
Kang, Jiehu [1 ]
Wu, Bin [1 ]
Ye, Tao [2 ]
Liang, Jian [1 ]
Affiliations
[1] Tianjin Univ, State Key Lab Precis Measuring Technol & Instrume, Tianjin 300072, Peoples R China
[2] China Univ Min & Technol, Sch Mech Elect & Informat Engn, Beijing 100083, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Convolutional neural network (CNN); deep learning; obstacle detection; railway traffic; transformer;
DOI
10.1109/TIM.2024.3372216
CLC Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808 ; 0809 ;
Abstract
The incursion of obstacles onto railway tracks poses a serious risk to train operations, and numerous accidents occur during train shunting. However, existing algorithms still struggle to balance detection accuracy and speed during train movement. Moreover, their accuracy and robustness are inadequate, particularly when handling small objects in complicated railway scenarios. To overcome these issues, this article proposes an efficient network based on convolution and transformer (AE-Net) for accurate, real-time detection of railway obstacles to ensure driving safety. First, an enhanced and lightweight transformer module (ETM) is constructed to strengthen the model's global modeling ability. Then, a lightweight feature integration module (LIM) is presented to integrate multibranch feature information and reduce model complexity. Finally, a reinforced multiscale feature fusion module (RFM) is utilized to enhance multiscale object detection, especially for small obstacles. The presented algorithm achieves 95.29% mAP at 145 frames/s on the railway dataset, outperforming YOLOv5s. In addition, experiments on MS COCO further show that AE-Net detects considerably better than current state-of-the-art models. Hence, it is practical to deploy AE-Net in real railway scenarios and in more complex multitarget scenarios.
Pages: 1-14 (14 pages)