Comparison of CNNs and Vision Transformers-Based Hybrid Models Using Gradient Profile Loss for Classification of Oil Spills in SAR Images

Cited by: 17
Authors
Basit, Abdul [1 ]
Siddique, Muhammad Adnan [1 ]
Bhatti, Muhammad Khurram [1 ]
Sarfraz, Muhammad Saquib [2 ]
Affiliations
[1] Informat Technol Univ Punjab ITU, Remote Sensing & Spatial Analyt Lab, Lahore 54000, Pakistan
[2] Karlsruhe Inst Technol KIT, Inst Anthropomat & Robot, D-76131 Karlsruhe, Germany
Keywords
oil spills; synthetic aperture radar (SAR); deep convolutional neural networks (DCNNs); vision transformers (ViTs); deep learning; semantic segmentation; marine pollution; remote sensing; NEURAL-NETWORK; SEGMENTATION;
DOI
10.3390/rs14092085
CLC Classification Number
X [Environmental Science, Safety Science];
Subject Classification Codes
08 ; 0830 ;
Abstract
Oil spillage over a sea or ocean surface is a threat to marine and coastal ecosystems. Spaceborne synthetic aperture radar (SAR) data have been used efficiently for the detection of oil spills due to their operational capability in all-day, all-weather conditions. The problem is often modeled as a semantic segmentation task: the images need to be segmented into multiple regions of interest such as sea surface, oil spill, lookalikes, ships, and land. Training a classifier for this task is particularly challenging due to the inherent class imbalance. In this work, we train a convolutional neural network (CNN) with multiple feature extractors for pixel-wise classification and introduce a new loss function, namely, "gradient profile" (GP) loss, which is a constituent of the more generic spatial profile loss proposed for image translation problems. For training, testing, and performance evaluation, we use a publicly available dataset with selected oil spill events verified by the European Maritime Safety Agency (EMSA). The results show that the proposed CNN trained with a combination of GP, Jaccard, and focal loss functions can detect oil spills with an intersection over union (IoU) value of 63.95%. The IoU values for the sea surface, lookalikes, ships, and land classes are 96.00%, 60.87%, 74.61%, and 96.80%, respectively. The mean intersection over union (mIoU) value across all classes is 78.45%, a 13% improvement over the state of the art for this dataset. Moreover, we provide extensive ablations on different CNN- and vision transformer (ViT)-based hybrid models to demonstrate the effectiveness of adding GP loss as an additional loss function for training. Results show that GP loss significantly improves the mIoU and F1 scores for CNN- as well as ViT-based hybrid models. GP loss turns out to be a promising loss function in the context of deep learning with SAR images.
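The abstract names a combination of GP, Jaccard, and focal losses but does not give formulas. The sketch below is a hedged, illustrative reading of that combination, not the authors' implementation: the Jaccard and focal terms follow their standard soft-segmentation definitions, while the `gp_loss` term only approximates the "gradient profile" idea with an L1 difference of 1-D finite differences over flattened predictions (the paper's actual GP loss operates on 2-D spatial gradient profiles).

```python
import math

def jaccard_loss(probs, labels, eps=1e-7):
    """Soft Jaccard (IoU) loss over flattened probabilities and 0/1 labels."""
    inter = sum(p * y for p, y in zip(probs, labels))
    union = sum(p + y - p * y for p, y in zip(probs, labels))
    return 1.0 - (inter + eps) / (union + eps)

def focal_loss(probs, labels, gamma=2.0, eps=1e-7):
    """Binary focal loss: down-weights well-classified pixels by (1 - pt)^gamma."""
    total = 0.0
    for p, y in zip(probs, labels):
        pt = p if y == 1 else 1.0 - p  # probability assigned to the true class
        total += -((1.0 - pt) ** gamma) * math.log(pt + eps)
    return total / len(probs)

def gp_loss(probs, labels):
    """Toy gradient-profile term: L1 distance between finite differences
    of prediction and label sequences (a 1-D stand-in for spatial gradients)."""
    dp = [b - a for a, b in zip(probs, probs[1:])]
    dy = [b - a for a, b in zip(labels, labels[1:])]
    return sum(abs(a - b) for a, b in zip(dp, dy)) / len(dp)

def total_loss(probs, labels):
    """Unweighted sum of the three terms, as the abstract's combination suggests."""
    return gp_loss(probs, labels) + jaccard_loss(probs, labels) + focal_loss(probs, labels)
```

Accurate predictions yield a lower combined loss than inverted ones, e.g. `total_loss([0.9, 0.9, 0.1, 0.1], [1, 1, 0, 0])` is well below `total_loss([0.1, 0.1, 0.9, 0.9], [1, 1, 0, 0])`.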
Pages: 18
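The reported mIoU is the unweighted mean of the five per-class IoU values quoted in the abstract; a minimal arithmetic check (values copied from the abstract, the `iou` helper is an illustrative stand-in for how per-class IoU is computed, not the authors' evaluation code):

```python
def iou(pred, target):
    """IoU for one class from boolean masks: |intersection| / |union|."""
    inter = sum(p and t for p, t in zip(pred, target))
    union = sum(p or t for p, t in zip(pred, target))
    return inter / union if union else 0.0

# Per-class IoU values reported in the abstract (in percent).
class_iou = {
    "sea surface": 96.00,
    "oil spill": 63.95,
    "lookalikes": 60.87,
    "ships": 74.61,
    "land": 96.80,
}

# mIoU is the unweighted mean over the five classes.
miou = sum(class_iou.values()) / len(class_iou)
print(f"mIoU = {miou:.2f}%")
```

Averaging the five values reproduces the reported 78.45% mIoU.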