Multimodal Remote Sensing Image Segmentation With Intuition-Inspired Hypergraph Modeling

Cited by: 42
Authors
He, Qibin [1, 2]
Sun, Xian [1, 2]
Diao, Wenhui [1, 3]
Yan, Zhiyuan [1, 3]
Yao, Fanglong [1, 2]
Fu, Kun [1, 2]
Affiliations
[1] Chinese Acad Sci, Aerosp Informat Res Inst, Key Lab Network Informat Syst Technol NIST, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Elect Elect & Commun Engn, Beijing 100190, Peoples R China
[3] Chinese Acad Sci, Inst Elect, Key Lab Network Informat Syst Technol NIST, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cognition; Semantics; Image segmentation; Remote sensing; Optical interferometry; Vegetation; Optical sensors; Multimodal remote sensing; intuitive reasoning; hypergraph learning; semantic segmentation; SEMANTIC SEGMENTATION; FUSION NETWORK; ATTENTION;
DOI
10.1109/TIP.2023.3245324
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multimodal remote sensing (RS) image segmentation aims to comprehensively utilize multiple RS modalities to assign pixel-level semantics to the studied scenes, which can provide a new perspective for global city understanding. Multimodal segmentation inevitably encounters the challenge of modeling intra- and inter-modal relationships, i.e., object diversity and modal gaps. However, previous methods are usually designed for a single RS modality and are limited by the noisy collection environment and poor discriminative information. Neuropsychology and neuroanatomy confirm that the human brain performs the guiding perception and integrative cognition of multimodal semantics through intuitive reasoning. Therefore, establishing an intuition-inspired semantic understanding framework for multimodal RS segmentation is the main motivation of this work. Driven by the superiority of hypergraphs in modeling high-order relationships, we propose an intuition-inspired hypergraph network (I²HN) for multimodal RS segmentation. Specifically, we present a hypergraph parser that imitates guiding perception to learn intra-modal object-wise relationships. It parses the input modality into irregular hypergraphs to mine semantic clues and generate robust monomodal representations. In addition, we design a hypergraph matcher that dynamically updates the hypergraph structure from the explicit correspondence of visual concepts, similar to integrative cognition, to improve cross-modal compatibility when fusing multimodal features. Extensive experiments on two multimodal RS datasets show that the proposed I²HN outperforms state-of-the-art models, achieving F1/mIoU scores of 91.4%/82.9% on the ISPRS Vaihingen dataset and 92.1%/84.2% on the MSAW dataset.
Pages: 1474-1487
Number of pages: 14