SkyScapes - Fine-Grained Semantic Understanding of Aerial Scenes

Cited by: 43
Authors
Azimi, Seyed Majid [1]
Henry, Corentin [1]
Sommer, Lars [2]
Schumann, Arne [2]
Vig, Eleonora [1]
Affiliations
[1] German Aerospace Center (DLR), Wessling, Germany
[2] Fraunhofer IOSB, Karlsruhe, Germany
Source
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) | 2019
DOI
10.1109/ICCV.2019.00749
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Understanding the complex urban infrastructure with centimeter-level accuracy is essential for many applications from autonomous driving to mapping, infrastructure monitoring, and urban management. Aerial images provide valuable information over a large area instantaneously; nevertheless, no current dataset captures the complexity of aerial scenes at the level of granularity required by real-world applications. To address this, we introduce SkyScapes, an aerial image dataset with highly accurate, fine-grained annotations for pixel-level semantic labeling. SkyScapes provides annotations for 31 semantic categories ranging from large structures, such as buildings, roads and vegetation, to fine details, such as 12 (sub-)categories of lane markings. We have defined two main tasks on this dataset: dense semantic segmentation and multi-class lane-marking prediction. We carry out extensive experiments to evaluate state-of-the-art segmentation methods on SkyScapes. Existing methods struggle to deal with the wide range of classes, object sizes, scales, and fine details present. We therefore propose a novel multi-task model, which incorporates semantic edge detection and is better tuned for feature extraction from a wide range of scales. This model achieves notable improvements over the baselines in region outlines and level of detail on both tasks.
Pages: 7392-7402
Page count: 11