A Spatial Hierarchical Reasoning Network for Remote Sensing Visual Question Answering

Cited by: 1

Authors:

Zhang, Zixiao ^{[1,2]}

Jiao, Licheng ^{[1,2]}

Li, Lingling ^{[1,2]}

Liu, Xu ^{[1,2]}

Chen, Puhua ^{[1,2]}

Liu, Fang ^{[1,2]}

Li, Yuxuan ^{[1,2]}

Guo, Zhicheng ^{[1,2]}

Affiliations:

[1] Xidian Univ, Key Lab Intelligent Percept & Image Understanding, Int Res Ctr Intelligent Percept & Computat, Minist Educ, Joint Int Res Lab Intelligent Percept, Xian 710071, Shaanxi, Peoples R China

[2] Xidian Univ, Sch Artificial Intelligence, Xian 710071, Shaanxi, Peoples R China

Source:

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2023 / Vol. 61

Funding:

National Natural Science Foundation of China;

Keywords:

Visualization; Remote sensing; Cognition; Task analysis; Geospatial analysis; Semantics; Question answering (information retrieval); Attention mechanism; multiscale representation; relational reasoning; visual question answering on remote sensing (RSVQA);

DOI:

10.1109/TGRS.2023.3237606

Chinese Library Classification (CLC) number:

P3 [Geophysics]; P59 [Geochemistry];

Discipline classification code:

0708 ; 070902 ;

Abstract:

For visual question answering on remote sensing (RSVQA), current methods rarely account for geospatial objects, which typically exhibit large differences in scale and position-sensitive properties. Moreover, modeling and reasoning about the relationships between entities have rarely been explored, which leads to one-sided and inaccurate answer predictions. In this article, a novel method called spatial hierarchical reasoning network (SHRNet) is proposed, which endows a remote sensing (RS) visual question answering (VQA) system with enhanced visual-spatial reasoning capability. Specifically, a hash-based spatial multiscale visual representation module is first designed to encode multiscale visual features embedded with spatial positional information. Then, spatial hierarchical reasoning is conducted to learn the high-order intra-group object relations across multiple scales under the guidance of linguistic cues. Finally, a visual-question (VQ) interaction module is employed to learn an effective image-text joint embedding for the final answer prediction. Experimental results on three public RS VQA datasets confirm the effectiveness and superiority of our model SHRNet.

References

Pages: 15

63 entries in total

[51] Yuan ZH, 2022, arXiv, DOI arXiv:2112.06343

[52] Weakly Supervised Learning Based on Coupled Convolutional Neural Networks for Aircraft Detection [J]. Zhang, Fan; Du, Bo; Zhang, Liangpei; Xu, Miaozhong. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2016, 54 (09): 5553-5563

[53] Saliency-Guided Unsupervised Feature Learning for Scene Classification [J]. Zhang, Fan; Du, Bo; Zhang, Liangpei. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2015, 53 (04): 2175-2184

[54] Laplacian Feature Pyramid Network for Object Detection in VHR Optical Remote Sensing Images [J]. Zhang, Wenhua; Jiao, Licheng; Li, Yuxuan; Huang, Zhongjian; Wang, Haoran. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60

[55] Hierarchical and Robust Convolutional Neural Network for Very High-Resolution Remote Sensing Object Detection [J]. Zhang, Yuanlin; Yuan, Yuan; Feng, Yachuang; Lu, Xiaoqiang. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2019, 57 (08): 5535-5548

[56] A 3-D Storm Motion Estimation Method Based on Point Cloud Learning and Doppler Weather Radar Data [J]. Zhang, Zhuoyu; He, Zhenghao; Yang, Jin; Liu, Yuchen; Bao, Riyang; Gao, Shuping. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60

[57] High-Resolution Remote Sensing Image Captioning Based on Structured Attention [J]. Zhao, Rui; Shi, Zhenwei; Zou, Zhengxia. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60

[58] Development of a Gray-Level Co-Occurrence Matrix-Based Texture Orientation Estimation Method and Its Application in Sea Surface Wind Direction Retrieval From SAR Imagery [J]. Zheng, Gang; Li, Xiaofeng; Zhou, Lizhang; Yang, Jingsong; Ren, Lin; Chen, Peng; Zhang, Huaguo; Lou, Xiulin. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2018, 56 (09): 5244-5260

[59] Zheng XT, 2022, IEEE T GEOSCI REMOTE, V60, DOI [10.1109/TGRS.2021.3079918, 10.1109/TGRS.2021.3116147]

[60] Zhou BL, 2015, arXiv, DOI arXiv:1512.02167
