Recurrent Thrifty Attention Network for Remote Sensing Scene Recognition

Citations: 74

Authors:

Fu, Liyong [1,2]

Zhang, Dong [3]

Ye, Qiaolin [4]

Affiliations:

[1] Nanjing Forestry Univ, Coll Informat Sci & Technol, Nanjing 210037, Peoples R China

[2] Chinese Acad Forestry, Inst Forest Resource Informat Tech, Beijing 100091, Peoples R China

[3] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China

[4] Nanjing Forestry Univ, Sch Informat Sci & Technol, Nanjing 210037, Peoples R China

Source:

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2021 / Vol. 59 / No. 10

Funding:

US National Science Foundation;

Keywords:

Attention learning; convolutional neural networks (CNNs); object detection; remote sensing scene (RSS) classification; RSS recognition; object tracking

DOI:

10.1109/TGRS.2020.3042507

Chinese Library Classification (CLC) number:

P3 [Geophysics]; P59 [Geochemistry];

Discipline classification code:

0708 ; 070902 ;

Abstract:

The self-attention mechanism has proven empirically effective in a wide range of computer vision applications. However, it is often criticized for its high computational cost. Although several revised methods have been proposed recently, they are not yet readily applicable to remote sensing scene (RSS) images. To address this problem, in this article, we propose a simple yet effective context acquisition module, named thrifty attention, which captures long-range dependencies efficiently and effectively. Moreover, a recurrent version of thrifty attention, termed recurrent thrifty attention (RTA), is further proposed to perform long-range multihop communication in space-time for RSS images. RTA is a general global contextual information acquisition module that can be used at any level of the hierarchy of a deep convolutional neural network. To demonstrate its superiority, we deploy it in the classical ResNet and establish our proposed RTA Network (RTANet). Extensive experiments are carried out on two levels of RSS recognition tasks, i.e., image-level RSS classification and instance-level RSS object detection. Compared with the standard self-attention mechanism, RTA reduces the number of model parameters by up to 0.43 M while only slightly increasing the model's floating-point operations (FLOPs). Furthermore, results on RSS classification and object detection verify the superior accuracy of RTANet.
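To make the cost criticism in the abstract concrete, the following is a minimal NumPy sketch of standard self-attention over a convolutional feature map. It is not the thrifty attention module or RTA, whose details are not given in this record. The function names, the projection matrices `w_q`, `w_k`, `w_v`, and the scaling by the square root of the projection width are illustrative assumptions. The point it shows is that the attention matrix has (H*W) x (H*W) entries, so memory and compute grow quadratically with the spatial size of the feature map.

```python
import numpy as np

def self_attention_2d(x, w_q, w_k, w_v):
    """Standard self-attention over a (C, H, W) feature map.

    Illustrative baseline only; this is NOT the paper's thrifty attention.
    w_q, w_k, w_v are hypothetical (C, D) projection matrices.
    """
    c, h, w = x.shape
    n = h * w
    tokens = x.reshape(c, n).T                 # (N, C): one token per spatial position
    q = tokens @ w_q                           # (N, D) queries
    k = tokens @ w_k                           # (N, D) keys
    v = tokens @ w_v                           # (N, D) values
    scores = q @ k.T / np.sqrt(q.shape[1])     # (N, N): the quadratic-cost term
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)  # row-wise softmax
    out = weights @ v                          # (N, D) attended features
    return out.T.reshape(-1, h, w), weights

def attention_matrix_entries(h, w):
    """Number of entries in the (H*W) x (H*W) attention matrix."""
    return (h * w) ** 2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 4, 4))
    w_q, w_k, w_v = (rng.standard_normal((8, 4)) for _ in range(3))
    out, weights = self_attention_2d(x, w_q, w_k, w_v)
    print(out.shape, weights.shape)
    print(attention_matrix_entries(56, 56))
```

Under these assumptions, doubling the height and width of the feature map multiplies the size of the attention matrix by 16, which is the kind of overhead that lighter-weight attention variants aim to avoid.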


Pages: 8257-8268

Number of pages: 12
