More Diverse Means Better: Multimodal Deep Learning Meets Remote-Sensing Imagery Classification

Cited by: 985
Authors
Hong, Danfeng [1]
Gao, Lianru [2]
Yokoya, Naoto [3, 4]
Yao, Jing [5]
Chanussot, Jocelyn [6, 7]
Du, Qian [8]
Zhang, Bing [2, 9]
Affiliations
[1] Univ Grenoble Alpes, CNRS, Grenoble INP, GIPSA Lab, F-38000 Grenoble, France
[2] Chinese Acad Sci, Aerosp Informat Res Inst, Key Lab Digital Earth Sci, Beijing 100094, Peoples R China
[3] Univ Tokyo, Grad Sch Frontier Sci, Chiba 2778561, Japan
[4] RIKEN, RIKEN Ctr Adv Intelligence Project AIP, Geoinformat Unit, Tokyo 1030027, Japan
[5] Xi An Jiao Tong Univ, Sch Math & Stat, Xian 710049, Peoples R China
[6] Univ Grenoble Alpes, INRIA, CNRS, Grenoble INP, LJK, F-38000 Grenoble, France
[7] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100094, Peoples R China
[8] Mississippi State Univ, Dept Elect & Comp Engn, Starkville, MS 39762 USA
[9] Univ Chinese Acad Sci, Coll Resources & Environm, Beijing 100049, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2021 / Vol. 59 / Issue 05
Funding
Japan Society for the Promotion of Science; National Natural Science Foundation of China;
Keywords
Classification; convolutional neural networks (CNNs); cross modality; deep learning (DL); feature learning; fusion; hyperspectral; light detection and ranging (LiDAR); multimodal; multispectral; network architecture; remote sensing (RS); synthetic aperture radar (SAR); CONVOLUTIONAL NEURAL-NETWORK; LAND-COVER; DATA FUSION; LIDAR DATA; MANIFOLD ALIGNMENT; FRAMEWORK;
DOI
10.1109/TGRS.2020.3016820
Chinese Library Classification number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708; 070902;
Abstract
Classification and identification of the materials lying over or beneath the earth's surface have long been a fundamental but challenging research topic in geoscience and remote sensing (RS), and have attracted growing attention owing to recent advances in deep learning techniques. Although deep networks have been successfully applied to single-modality-dominated classification tasks, their performance inevitably hits a bottleneck in complex scenes that need to be finely classified, because of limited information diversity. In this work, we provide a baseline solution to this difficulty by developing a general multimodal deep learning (MDL) framework. In particular, we also investigate a special case of multi-modality learning (MML), namely cross-modality learning (CML), which exists widely in RS image classification applications. By focusing on "what," "where," and "how" to fuse, we show different fusion strategies as well as how to train deep networks and build the network architecture. Specifically, five fusion architectures are introduced and developed, and are further unified in our MDL framework. More significantly, our framework is not limited to pixel-wise classification tasks but is also applicable to spatial information modeling with convolutional neural networks (CNNs). To validate the effectiveness and superiority of the MDL framework, extensive experiments under the MML and CML settings are conducted on two different multimodal RS data sets. Furthermore, the codes and data sets will be available at https://github.com/danfenghong/IEEE_TGRS_MDLRS, contributing to the RS community.
Pages: 4340 - 4354
Number of pages: 15