Artificial intelligence based semantic segmentation on aerial images with variational mode decomposition

Cited by: 0
Authors
Vijai, Anupa [1]
Padmavathi, S. [1]
Venkataraman, D. [1]
Affiliations
[1] Amrita Vishwa Vidyapeetham, Dept Comp Sci & Engn, Amrita Sch Comp, Coimbatore, India
Keywords
Aerial images; Mode selection; Semantic segmentation; Variational mode decomposition; Transformer; NETWORK
DOI
10.1016/j.engappai.2025.111140
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Segmenting the objects in an image is a significant task in computer vision, and it calls for algorithms tailored to the nature of the images and the type of objects they contain. Although modern deep learning models deliver state-of-the-art performance on aerial images, two major challenges remain: the small number of training samples, and the segmentation of images that contain small and multiple objects. This paper proposes Variational Mode Decomposition (VMD) based semantic segmentation of aerial images for six different classes of objects. VMD extracts the predominant frequency components of an image, offering a better representation of the input image before segmentation. These frequency components are coupled with basic deep learning models, resulting in better segmentation results. The paper suggests a novel mode selection method that extracts coarser information to aid efficient segmentation. It also proposes a VMD-based solution to the problem of having few training samples. The proposed methodology was tested on images from three publicly available datasets: Dubai, Northwestern Polytechnical University Very High Resolution (NWPU VHR-10), and Aerial Image Dataset (AID). Based on quantitative and qualitative analysis of the results on the Dubai dataset, it is inferred that the performance of basic deep neural networks improves when they are coupled with the VMD technique. Among the deep neural networks tested in this paper, the VMD-based UNet with Vision Transformer outperformed the other models in terms of visual comparison, mean Intersection over Union (mIoU), and mean F1 score.
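The two quantitative metrics named in the abstract, mIoU and mean F1, can be illustrated with a minimal sketch. It assumes integer class labels per pixel and averages only over classes that actually occur. The function names and the toy labels are hypothetical and are not taken from the paper, whose exact evaluation protocol is not given in this record.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels for every (true class, predicted class) pair."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    flat_index = y_true * num_classes + y_pred
    counts = np.bincount(flat_index, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def mean_iou_and_f1(cm):
    """Return (mIoU, mean F1), averaged over classes present in the data."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp  # predicted as the class, but belongs to another
    fn = cm.sum(axis=1) - tp  # belongs to the class, but predicted as another
    iou_den = tp + fp + fn
    present = iou_den > 0     # skip classes absent from both labels and predictions
    iou = tp[present] / iou_den[present]
    f1 = 2 * tp[present] / (2 * tp[present] + fp[present] + fn[present])
    return iou.mean(), f1.mean()

# Toy example: six pixels, three classes (illustrative values only).
cm = confusion_matrix([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3)
miou, mean_f1 = mean_iou_and_f1(cm)
```

For a six-class problem such as the one described, the same functions would be called with `num_classes=6` on the flattened label and prediction masks.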
Pages: 23
Related papers
99 in total
[1]   An ensemble architecture of deep convolutional Segnet and Unet networks for building semantic segmentation from high-resolution aerial images [J].
Abdollahi, Abolfazl ;
Pradhan, Biswajeet ;
Alamri, Abdullah M. .
GEOCARTO INTERNATIONAL, 2022, 37 (12) :3355-3370
[2]   A Multivariate Empirical Mode Decomposition Based Approach to Pansharpening [J].
Abdullah, Syed Muhammad Umer ;
Rehman, Naveed Ur ;
Khan, Muhammad Murtaza ;
Mandic, Danilo P. .
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2015, 53 (07) :3974-3984
[3]   Segmentation of remotely sensed images using wavelet features and their evaluation in soft computing framework [J].
Acharyya, M ;
De, RK ;
Kundu, MK .
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2003, 41 (12) :2900-2905
[4]   A real-time efficient object segmentation system based on U-Net using aerial drone images [J].
Ahmed, Imran ;
Ahmad, Misbah ;
Jeon, Gwanggil .
JOURNAL OF REAL-TIME IMAGE PROCESSING, 2021, 18 (05) :1745-1758
[5]   Semantic Segmentation of High-Resolution Airborne Images with Dual-Stream DeepLabV3+ [J].
Akcay, Ozgun ;
Kinaci, Ahmet Cumhur ;
Avsar, Emin Ozgur ;
Aydar, Umut .
ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, 2022, 11 (01)
[6]   RETRACTED: An improved beluga whale optimizer-Derived Adaptive multi-channel DeepLabv3+ for semantic segmentation of aerial images (Retracted article. See vol. 20, 2025) [J].
Anilkumar, P. ;
Venugopal, P. .
PLOS ONE, 2023, 18 (10) :e0290624
[7]   An Enhanced Multi-Objective-Derived Adaptive DeepLabv3 Using G-RDA for Semantic Segmentation of Aerial Images [J].
Anilkumar, P. ;
Venugopal, P. .
ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2023, 48 (08) :10745-10769
[8]   Aerial LaneNet: Lane-Marking Semantic Segmentation in Aerial Imagery Using Wavelet-Enhanced Cost-Sensitive Symmetric Fully Convolutional Neural Networks [J].
Azimi, Seyed Majid ;
Fischer, Peter ;
Koerner, Marco ;
Reinartz, Peter .
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2019, 57 (05) :2920-2938
[9]   SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation [J].
Badrinarayanan, Vijay ;
Kendall, Alex ;
Cipolla, Roberto .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (12) :2481-2495
[10]   Bhattacharjee A.D., 2023, INT C ADV DAT DRIV C, V21, P175