Performance Analysis of Semantic Segmentation Algorithms for Finely Annotated New UAV Aerial Video Dataset (ManipalUAVid)

Cited by: 24

Authors:

Girisha, S. [1]

Pai, Manohara M. M. [1]

Verma, Ujjwal [2]

Pai, Radhika M. [1]

Affiliations:

[1] Manipal Acad Higher Educ, Manipal Inst Technol, Dept Informat & Commun Technol, Manipal 576104, India

[2] Manipal Acad Higher Educ, Manipal Inst Technol, Dept Elect & Commun Engn, Manipal 576104, India

Source:

IEEE Access | 2019 / Vol. 7

Keywords:

Semantics; Roads; Image segmentation; Unmanned aerial vehicles; Standards; Buildings; Education; Convolutional neural networks; semantic segmentation; shot boundary detection; UAV video

DOI:

10.1109/ACCESS.2019.2941026

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Semantic segmentation of videos helps in scene understanding, thereby assisting other automated video processing tasks such as anomaly detection, object detection, and event detection. However, there has been limited study of semantic segmentation of videos acquired using Unmanned Aerial Vehicles (UAVs), primarily due to the absence of a standard dataset. In this paper, a new UAV aerial video dataset (ManipalUAVid) for semantic segmentation is presented. The videos were acquired in a closed university campus, and fine annotation is provided for four background classes, viz. constructions, greeneries, roads, and waterbodies. In addition, the performance of four semantic segmentation approaches, Conditional Random Field (CRF), U-Net, Fully Convolutional Network (FCN), and DeepLabV3+, is analysed on the ManipalUAVid dataset. These algorithms perform competitively on the UAV aerial video dataset, achieving mIoU values of 0.86, 0.86, 0.86, and 0.83, respectively.

Pages: 136239-136253

Page count: 15
