Few-shot remote sensing image scene classification based on multiscale covariance metric network (MCMNet)

Cited: 15
Authors
Chen, Xiliang [1 ]
Zhu, Guobin [1 ]
Liu, Mingqing [1 ]
Chen, Zhaotong [1 ]
Affiliations
[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430079, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
FSL; Covariance network; Image scene recognition; Prototype;
DOI
10.1016/j.neunet.2023.04.002
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Few-shot learning (FSL) is a paradigm that mimics the fast learning ability of human beings: it learns the feature differences between two groups of small-scale samples that share a common label space, while the label spaces of the training set and the test set do not overlap. In this way, it can quickly identify the categories of unseen images in the test set. FSL is widely used in image scene recognition and is expected to overcome the scarcity of annotated samples in remote sensing (RS). However, most existing FSL methods embed images into Euclidean space and measure the similarity between features at the last layer of a deep network with the Euclidean distance, which makes it difficult to capture the inter-class similarity and intra-class variation of RS images. In this paper, we propose a multiscale covariance metric network (MCMNet) for remote sensing scene classification (RSSC). Taking Conv64F as the backbone, we map the features of layers 1, 2, and 4 of the network to manifold space by constructing regional covariance matrices, forming a covariance network at different scales. For the features of each layer, we introduce a center in the manifold space as the prototype of each category. We simultaneously measure the similarity to the three prototypes on the manifold space at different scales to form three loss functions, and we optimize the whole network with an episodic training strategy. We conducted comparative experiments on three public datasets. The results show that the classification accuracy (CA) of the proposed method is 1.35% to 2.36% higher than that of the best-performing baseline, demonstrating that MCMNet outperforms other methods. (c) 2023 Elsevier Ltd. All rights reserved.
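The abstract only sketches the covariance embedding, so the following is a minimal, hypothetical PyTorch reading of it: building a regional covariance matrix from a convolutional feature map, mapping it to a flat tangent space with the matrix logarithm (a log-Euclidean treatment, which is an assumption not stated in the abstract), and scoring query images against class prototypes. All names here (`regional_covariance`, `prototype_logits`, etc.) are illustrative and are not the authors' API.

```python
# Sketch only: one plausible reading of the covariance-prototype metric
# described in the abstract, not the authors' implementation.
import torch

def regional_covariance(feat: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Map a conv feature map (B, C, H, W) to SPD covariance matrices (B, C, C)."""
    b, c, h, w = feat.shape
    x = feat.reshape(b, c, h * w)              # treat each spatial location as a sample
    x = x - x.mean(dim=2, keepdim=True)        # center per channel
    cov = x @ x.transpose(1, 2) / (h * w - 1)  # (B, C, C) sample covariance
    eye = torch.eye(c, device=feat.device)
    return cov + eps * eye                     # regularize so the matrix is strictly SPD

def log_euclidean(spd: torch.Tensor) -> torch.Tensor:
    """Matrix logarithm of SPD matrices, flattened for Euclidean-style comparison."""
    eigval, eigvec = torch.linalg.eigh(spd)
    log_spd = eigvec @ torch.diag_embed(eigval.clamp_min(1e-8).log()) @ eigvec.transpose(-1, -2)
    return log_spd.flatten(start_dim=-2)       # (B, C*C)

def prototype_logits(query_feat, support_feat, support_labels, n_way):
    """Negative distance from each query covariance embedding to the per-class
    prototype (mean of support embeddings in the log-Euclidean tangent space).
    `support_labels` is an integer tensor of length Ns."""
    q = log_euclidean(regional_covariance(query_feat))    # (Nq, C*C)
    s = log_euclidean(regional_covariance(support_feat))  # (Ns, C*C)
    protos = torch.stack([s[support_labels == k].mean(0) for k in range(n_way)])
    return -torch.cdist(q, protos)                        # higher = more similar
```

In the setup described by the abstract, logits of this kind would be computed at three scales (the features of layers 1, 2, and 4 of Conv64F), and the three resulting cross-entropy losses would be summed during episodic training.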
Pages: 132-145
Page count: 14