On the Co-Selection of Vision Transformer Features and Images for Very High-Resolution Image Scene Classification

Cited by: 11
Authors
Chaib, Souleyman [1]
Mansouri, Dou El Kefel [2]
Omara, Ibrahim [3]
Hagag, Ahmed [4]
Dhelim, Sahraoui [5]
Bensaber, Djamel Amar [1]
Affiliations
[1] Ecole Super Informat, LabRi Lab, Sidi Bel Abbes 22000, Algeria
[2] Ibn Khaldoun Univ, Fac Sci Nat & Life, Tiaret 14000, Algeria
[3] Menoufia Univ, Fac Artificial Intelligence, Dept Machine Intelligence, Shibin Al Kawm 32511, Egypt
[4] Benha Univ, Fac Comp & Artificial Intelligence, Dept Sci Comp, Banha 13518, Egypt
[5] Univ Coll Dublin, Sch Comp Sci, Dublin D04 V1W8, Ireland
Keywords
very high-resolution images (VHRI); vision transformer (ViT); image scene classification; deep features; convolutional neural networks; remote; attention; scale
DOI
10.3390/rs14225817
Chinese Library Classification number
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
Recent developments in remote sensing technology make it possible to observe the Earth with very high-resolution (VHR) images, and VHR scene classification remains a challenging problem in remote sensing. Vision transformer (ViT) models have achieved breakthrough results in image recognition tasks. However, the transformer-encoder layers encode different levels of features: the last layer represents semantic information, whereas the earliest layers contain more detailed data but ignore the semantic information of an image scene. In this paper, a new deep framework is proposed for VHR scene understanding that exploits the strengths of ViT features in a simple and effective way. First, pre-trained ViT models are used to extract informative features from the original VHR image scenes, with the transformer-encoder layers generating the feature descriptors of the input images. Second, the obtained features are merged into one signal data set. Third, some extracted ViT features do not describe certain image scenes well, such as agriculture, meadows, and beaches, which could negatively affect the performance of the classification model. To deal with this challenge, a new algorithm for feature and image selection is proposed. It eliminates the less important features and images, as well as abnormal ones: based on preserving the similarity of the whole data set, the most informative features and important images are selected, and the irrelevant images that degrade classification accuracy are dropped. The proposed method was tested on three VHR benchmarks, and the experimental results demonstrate that it outperforms other state-of-the-art methods.
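The three steps in the abstract (per-layer descriptors, merging, then joint selection of features and images) can be pictured with a minimal NumPy sketch. Everything here is an assumption made for illustration: the random arrays stand in for real ViT encoder outputs, the function names are hypothetical, and the variance and centroid-distance rules are simple stand-ins, not the similarity-preserving co-selection algorithm of the paper.

```python
import numpy as np


def merge_layer_features(layer_feats):
    """Concatenate per-layer descriptors (each n_images x d_layer) column-wise."""
    return np.concatenate(layer_feats, axis=1)


def select_features_and_images(X, n_features, n_images):
    """Illustrative stand-in for co-selection (NOT the paper's algorithm).

    Features are ranked by variance and the top n_features are kept.
    Images are ranked by distance to the centroid in the reduced feature
    space, and the n_images closest ones are kept (far ones are treated
    as abnormal and dropped).
    """
    # Indices of the highest-variance features, returned in ascending order.
    feat_idx = np.sort(np.argsort(X.var(axis=0))[::-1][:n_features])
    X_sel = X[:, feat_idx]
    # Distance of every image to the centroid of the reduced data set.
    dist = np.linalg.norm(X_sel - X_sel.mean(axis=0), axis=1)
    img_idx = np.sort(np.argsort(dist)[:n_images])
    return feat_idx, img_idx


rng = np.random.default_rng(0)
# Hypothetical descriptors from three encoder layers for 8 images.
layer_feats = [rng.normal(size=(8, 4)) for _ in range(3)]
X = merge_layer_features(layer_feats)

feat_idx, img_idx = select_features_and_images(X, n_features=6, n_images=5)
# Reduced data set: selected images (rows) and selected features (columns).
X_reduced = X[np.ix_(img_idx, feat_idx)]
```

In this sketch the reduced matrix `X_reduced` is what a downstream classifier would be trained on; the selection criteria are the part a real implementation would replace.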
Pages: 19