CA-YOLOX: Deep Learning-Guided Road Intersection Location From High-Resolution Remote Sensing Images

Cited: 0
Authors
Li, Chengfan [1 ,2 ,3 ]
Zhang, Zixuan [4 ]
Liu, Lan [5 ]
Wang, Shengnan [4 ]
Zhao, Junjuan [4 ]
Liu, Xuefeng [6 ]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai 200444, Peoples R China
[2] Wuhan Univ, Key Lab Natl Geog Census & Monitoring, Minist Nat Resources, Wuhan 430079, Peoples R China
[3] East China Univ Technol, Key Lab Digital Land & Resources Jiangxi Prov, Nanchang 330013, Peoples R China
[4] Shanghai Univ, Sch Comp Engn & Sci, Shangda 99, Shanghai 200444, Peoples R China
[5] Shanghai Univ Engn Sci, Sch Elect & Elect Engn, Longteng 333, Shanghai 201620, Peoples R China
[6] Shanghai Univ, Sch Commun & Informat Engn, Shangda 99, Shanghai 200444, Peoples R China
Funding
Natural Science Foundation of Shanghai;
Keywords
Road intersection; object location; deep learning; coordinate attention (CA); high-resolution remote sensing (HRRS) image; OBJECT DETECTION; NEURAL-NETWORK;
DOI
10.1142/S0218001423510175
CLC number
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Road intersection locations can be extracted automatically from high-resolution remote sensing (HRRS) images by deep learning, and they have become an important data source for smart urban transportation. However, because road intersections in real scenes are small, diverse in type, complexly distributed, and often lack sample labels, it is difficult for a deep neural network (DNN) model to represent their key features accurately. This paper presents a new coordinate attention (CA) module-YOLOX (CA-YOLOX) method for accurately locating road intersections in HRRS images. First, a spatial pyramid pooling (SPP) module is introduced into the backbone network between the last feature layer of Darknet-53 and the feature pyramid network (FPN) structure. Second, the CA module is embedded into the feature-fusion structure of the FPN so that the network focuses more on the spatial shape distribution and texture features of road intersections. Third, focal loss replaces the traditional binary cross-entropy (BCE) loss in the confidence loss to speed up convergence of the CA-YOLOX network. Finally, extensive experiments and an ablation study on the Potsdam and IKONOS datasets show that the presented CA-YOLOX method improves the location accuracy of road intersections in HRRS images compared with traditional You Only Look Once (YOLO) models.
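The two components the abstract names, the coordinate attention gate and the focal-loss confidence term, can be sketched as follows. This is an illustrative NumPy sketch, not the authors' implementation: the `w_h` and `w_w` weight matrices are hypothetical stand-ins for the learned 1x1 convolutions of the CA module, and the batch/channel layout is simplified to a single `(C, H, W)` feature map.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def coordinate_attention(x, w_h, w_w):
    """Minimal coordinate-attention sketch (after Hou et al., CVPR 2021).

    x: feature map of shape (C, H, W).
    w_h, w_w: hypothetical (C, C) weights standing in for the learned
    1x1 convolutions that map pooled descriptors to attention values.
    """
    # Direction-aware pooling: average along width and along height,
    # so each descriptor keeps position along one spatial axis.
    pool_h = x.mean(axis=2)          # (C, H): encodes vertical position
    pool_w = x.mean(axis=1)          # (C, W): encodes horizontal position
    # 1x1 transform plus sigmoid yields two directional attention maps.
    a_h = sigmoid(w_h @ pool_h)      # (C, H)
    a_w = sigmoid(w_w @ pool_w)      # (C, W)
    # Re-weight the input along both spatial directions.
    return x * a_h[:, :, None] * a_w[:, None, :]

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss, the drop-in replacement for the BCE
    confidence loss: easy examples are down-weighted by (1-p)^gamma.

    p: predicted probabilities, y: binary labels (same-shape arrays).
    """
    p = np.clip(p, eps, 1.0 - eps)
    loss_pos = -alpha * (1.0 - p) ** gamma * np.log(p)
    loss_neg = -(1.0 - alpha) * p ** gamma * np.log(1.0 - p)
    return np.mean(np.where(y == 1, loss_pos, loss_neg))
```

The down-weighting is what the abstract's third step relies on: a well-classified positive (`p` near 1) contributes almost nothing, so gradient updates concentrate on hard intersections and iteration converges faster than with plain BCE.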
Pages: 26