SkyScapes - Fine-Grained Semantic Understanding of Aerial Scenes

被引:43
作者
Azimi, Seyed Majid [1 ]
Henry, Corentin [1 ]
Sommer, Lars [2 ]
Schumann, Arne [2 ]
Vig, Eleonora [1 ]
机构
[1] German Aerosp Ctr DLR, Wessling, Germany
[2] Fraunhofer IOSB, Karlsruhe, Germany
来源
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) | 2019年
关键词
D O I
10.1109/ICCV.2019.00749
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Understanding the complex urban infrastructure with centimeter-level accuracy is essential for many applications from autonomous driving to mapping, infrastructure monitoring, and urban management. Aerial images provide valuable information over a large area instantaneously; nevertheless, no current dataset captures the complexity of aerial scenes at the level of granularity required by real-world applications. To address this, we introduce SkyScapes, an aerial image dataset with highly-accurate, fine-grained annotations for pixel-level semantic labeling. SkyScapes provides annotations for 31 semantic categories ranging from large structures, such as buildings, roads and vegetation, to fine details, such as 12 (sub-)categories of lane markings. We have defined two main tasks on this dataset: dense semantic segmentation and multi-class lane-marking prediction. We carry out extensive experiments to evaluate state-of-the-art segmentation methods on SkyScapes. Existing methods struggle to deal with the wide range of classes, object sizes, scales, and fine details present. We therefore propose a novel multi-task model, which incorporates semantic edge detection and is better tuned for feature extraction from a wide range of scales. This model achieves notable improvements over the baselines in region outlines and level of detail on both tasks.
引用
收藏
页码:7392 / 7402
页数:11
相关论文
共 49 条
[1]  
[Anonymous], 2011, P INT C SENS MOD PHO
[2]  
[Anonymous], 2017, arXiv preprint arXiv:1706.05587, DOI DOI 10.48550/ARXIV.1706.05587
[3]   Aerial LaneNet: Lane-Marking Semantic Segmentation in Aerial Imagery Using Wavelet-Enhanced Cost-Sensitive Symmetric Fully Convolutional Neural Networks [J].
Azimi, Seyed Majid ;
Fischer, Peter ;
Koerner, Marco ;
Reinartz, Peter .
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2019, 57 (05) :2920-2938
[4]  
Azimi Seyed Majid, 2018, P AS C COMP VIS ACCV
[5]   SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation [J].
Badrinarayanan, Vijay ;
Kendall, Alex ;
Cipolla, Roberto .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (12) :2481-2495
[6]  
CARNEIRO R V., 2018, 2018 International Joint Conference on Neural Networks (IJCNN), P1
[7]  
Chen LC, 2014, ARXIV
[8]   DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs [J].
Chen, Liang-Chieh ;
Papandreou, George ;
Kokkinos, Iasonas ;
Murphy, Kevin ;
Yuille, Alan L. .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (04) :834-848
[9]  
Chen Liang-Chieh, 2018, ECCV, P801, DOI [DOI 10.1007/978-3-030-01234-249, 10.1007/978-3-030-01234-2_49]
[10]   The Cityscapes Dataset for Semantic Urban Scene Understanding [J].
Cordts, Marius ;
Omran, Mohamed ;
Ramos, Sebastian ;
Rehfeld, Timo ;
Enzweiler, Markus ;
Benenson, Rodrigo ;
Franke, Uwe ;
Roth, Stefan ;
Schiele, Bernt .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :3213-3223