Deep Hierarchical Vision Transformer for Hyperspectral and LiDAR Data Classification

Cited by: 151
Authors
Xue, Zhixiang [1]
Tan, Xiong [1]
Yu, Xuchu [1]
Liu, Bing [1]
Yu, Anzhu [1]
Zhang, Pengqiang [1]
Affiliations
[1] PLA Strategic Support Force Information Engineering University, Zhengzhou 450001, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Transformers; Hyperspectral imaging; Laser radar; Data mining; Collaboration; Data models; Hyperspectral image; light detection and ranging; joint classification; vision transformer; convolutional vision transformer; cross attention fusion; REMOTE-SENSING DATA; DATA FUSION; LAND-COVER; IMAGE CLASSIFICATION; EXTINCTION PROFILES; FEATURE-EXTRACTION;
DOI
10.1109/TIP.2022.3162964
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this study, we develop a novel deep hierarchical vision transformer (DHViT) architecture for the joint classification of hyperspectral and light detection and ranging (LiDAR) data. Current classification methods are limited in heterogeneous feature representation and information fusion for multi-modality remote sensing data (e.g., hyperspectral and LiDAR data), and these shortcomings restrict the accuracy of collaborative classification. The proposed architecture exploits both the powerful long-range dependency modeling and the strong cross-domain generalization ability of the transformer network, which is based exclusively on the self-attention mechanism. Specifically, a spectral sequence transformer handles the long-range dependencies along the spectral dimension of hyperspectral images, because all diagnostic spectral bands contribute to land cover classification. Thereafter, a spatial hierarchical transformer structure extracts hierarchical spatial features from the hyperspectral and LiDAR data, which are also crucial for classification. Furthermore, a cross attention (CA) feature fusion pattern adaptively and dynamically fuses heterogeneous features from the multi-modality data, and this context-aware fusion mode further improves collaborative classification performance. Comparative experiments and ablation studies are conducted on three benchmark hyperspectral and LiDAR datasets, on which the DHViT model achieves average overall classification accuracies of 99.58%, 99.55%, and 96.40%, respectively, demonstrating the effectiveness and superior performance of the proposed method.
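The abstract names cross attention as the fusion mechanism but gives no implementation details. As a rough illustration only, the sketch below shows generic single-head scaled dot-product cross attention, in which tokens from one modality form the queries and tokens from the other form the keys and values. It is not the authors' DHViT code: the names `hsi_tokens`, `lidar_tokens`, `cross_attention`, the random projection matrices, the token counts, and the residual connection are all assumptions made for the example.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query_tokens, context_tokens, w_q, w_k, w_v):
    """Generic single-head cross attention (illustrative, not the paper's exact layer).

    Queries come from one modality; keys and values come from the other.
    Returns the fused tokens and the attention weights.
    """
    q = matmul(query_tokens, w_q)    # (n_query, d)
    k = matmul(context_tokens, w_k)  # (n_context, d)
    v = matmul(context_tokens, w_v)  # (n_context, d)
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]  # transpose to (d, n_context)
    scores = matmul(q, k_t)               # (n_query, n_context)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    attended = matmul(weights, v)         # (n_query, d)
    # Assumed residual connection; requires d to equal the token dimension.
    fused = [[x + y for x, y in zip(tok, att)]
             for tok, att in zip(query_tokens, attended)]
    return fused, weights


def rand_matrix(rows, cols):
    """Hypothetical stand-in for learned weights or extracted features."""
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
dim = 4
hsi_tokens = rand_matrix(5, dim)    # e.g. 5 hyperspectral feature tokens
lidar_tokens = rand_matrix(3, dim)  # e.g. 3 LiDAR feature tokens
w_q, w_k, w_v = (rand_matrix(dim, dim) for _ in range(3))

fused, weights = cross_attention(hsi_tokens, lidar_tokens, w_q, w_k, w_v)
print(len(fused), len(fused[0]))
```

Each row of `weights` is a probability distribution over the LiDAR tokens, so every hyperspectral token receives its own input-dependent mixture of LiDAR features. That per-token weighting is one plausible reading of the "adaptive and dynamic" fusion the abstract describes.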
Pages: 3095-3110
Page count: 16