AAFormer: Attention-Attended Transformer for Semantic Segmentation of Remote Sensing Images

Cited by: 38
Authors
Li, Xin [1, 2]
Xu, Feng [1, 2, 3]
Li, Linyang [4, 5]
Xu, Nan [6]
Liu, Fan [1, 2]
Yuan, Chi [1, 2]
Chen, Ziqi [7]
Lyu, Xin [1, 2]
Affiliations
[1] Hohai Univ, Coll Comp & Informat, Nanjing 211100, Peoples R China
[2] Hohai Univ, Minist Water Resources, Key Lab Water Big Data Technol, Nanjing 211100, Peoples R China
[3] Jiangsu Ocean Univ, Sch Comp Engn, Lianyungang 222005, Peoples R China
[4] Informat Engn Univ, Surveying & Mapping Inst, Zhengzhou 450001, Peoples R China
[5] Wuhan Univ, Sch Geodesy & Geomat, Wuhan 430079, Peoples R China
[6] Hohai Univ, Coll Geog & Remote Sensing, Nanjing 211100, Peoples R China
[7] Tsinghua Univ, Dept Earth Syst Sci, Beijing 100084, Peoples R China
Keywords
Active appearance model; Transformers; Semantic segmentation; Remote sensing; Semantics; Decoding; Merging; High-resolution remote sensing images (RSIs); local and global contexts; semantic segmentation; transformer; NETWORK;
DOI
10.1109/LGRS.2024.3397851
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
Rapid advances in remote sensing technology have made fine-resolution remote sensing images (RSIs) widely available, offering rich spatial details and semantics. Although transformers are applicable and scalable for semantic segmentation of RSIs because they learn pairwise contextual affinity, they inevitably introduce irrelevant context, which hinders accurate inference of patch semantics. To address this, we propose a novel multihead attention-attended module (AAM) that refines the multihead self-attention mechanism (AM). The AAM filters out irrelevant context while highlighting informative context by considering the relevance between self-attention maps and the query vector. It generates an attention gate that complements the contextual affinity and simultaneously assigns a higher weight to the useful affinities. Leveraging multihead AAM as the core unit, we construct a lightweight attention-attended transformer block (ATB). Subsequently, we devise AAFormer, a pure transformer with a mask transformer decoder, for semantic segmentation of RSIs. We extensively evaluate our approach on the ISPRS Potsdam and LoveDA datasets, demonstrating compelling performance compared to mainstream methods. Additionally, we conduct evaluations to analyze the effects of AAM.
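The abstract gives no equations for the AAM, so the following is only a rough sketch of the general idea it describes: a gate, computed from the query and the attended context, that rescales what self-attention returns. The concatenate-then-sigmoid gate, the single head, and every name here (`gated_self_attention`, `init_params`, `w_g`, `w_i`) are assumptions made for illustration and should not be read as the authors' formulation.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def init_params(d, seed=0):
    """Random weights for the sketch; all matrix names are hypothetical."""
    rng = np.random.default_rng(seed)
    scale = 1.0 / np.sqrt(d)
    return {
        "w_q": rng.normal(0.0, scale, (d, d)),
        "w_k": rng.normal(0.0, scale, (d, d)),
        "w_v": rng.normal(0.0, scale, (d, d)),
        "w_g": rng.normal(0.0, scale, (2 * d, d)),  # gate projection (assumed)
        "w_i": rng.normal(0.0, scale, (2 * d, d)),  # information projection (assumed)
    }


def gated_self_attention(tokens, params):
    """Single-head self-attention followed by a query-conditioned gate.

    tokens: (n, d) array of patch embeddings.
    Returns (output, gate, affinity).
    """
    q = tokens @ params["w_q"]
    k = tokens @ params["w_k"]
    v = tokens @ params["w_v"]
    d = q.shape[-1]

    # Standard pairwise contextual affinity between patches.
    affinity = softmax(q @ k.T / np.sqrt(d), axis=-1)  # (n, n), rows sum to 1
    context = affinity @ v                             # (n, d)

    # Assumed gate: judge the attended context against the query that
    # produced it, then scale the information vector element-wise.
    joint = np.concatenate([q, context], axis=-1)      # (n, 2d)
    gate = sigmoid(joint @ params["w_g"])              # (n, d), values in (0, 1)
    info = joint @ params["w_i"]                       # (n, d)
    return gate * info, gate, affinity


# Toy run: 4 patches with 8-dimensional embeddings.
params = init_params(d=8)
tokens = np.random.default_rng(1).normal(size=(4, 8))
output, gate, affinity = gated_self_attention(tokens, params)
```

In this sketch, gate values near 0 suppress a feature of the attended context and values near 1 pass it through, which is one plausible reading of "filtering out irrelevant context while highlighting informative context".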
Pages: 1-5
Number of pages: 5