Satellite Component Semantic Segmentation: Video Dataset and Real-Time Pyramid Attention and Decoupled Attention Network

Cited by: 4

Authors:

Shao, Yadong [1,2]

Wu, Aodi [2]

Li, Shengyang [3,4]

Shu, Leizheng [3,4]

Wan, Xue [3,4]

Shao, Yuanbin

Huo, Junyan [5]

Affiliations:

[1] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100049, Peoples R China

[2] Univ Chinese Acad Sci, Sch Aeronaut & Astronaut, Beijing 100049, Peoples R China

[3] Chinese Acad Sci, Technol & Engn Ctr Space Utilizat, Beijing 100094, Peoples R China

[4] Chinese Acad Sci, Key Lab Space Utilizat, Beijing 100094, Peoples R China

[5] Univ Warwick, Dept Comp Sci, Coventry CV4 8UW, England

Source:

IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS | 2023 / Vol. 59 / No. 6

Keywords:

Satellites; Semantic segmentation; Task analysis; Real-time systems; Streaming media; Solid modeling; Semantics;

DOI:

10.1109/TAES.2023.3282608

Chinese Library Classification (CLC) number:

V [Aviation, Aerospace];

Discipline classification codes:

08 ; 0825 ;

Abstract:

High-accuracy, real-time semantic segmentation of satellite components can locate the key components to be operated on during on-orbit servicing, such as solar panels, which is of great significance for navigation and control. Two main challenges remain unsolved, however. First, satellite component semantic segmentation algorithms require a large number of training images, yet on-orbit satellite images are difficult to obtain, and a large-scale satellite component video dataset is harder still to assemble. Second, high-accuracy semantic segmentation networks require relatively large computational resources, which are difficult to provide in on-orbit tasks. The key aim of this article is therefore to build a satellite component semantic segmentation network that meets the requirements of both high accuracy and real-time on-orbit operation. To this end, a simulated satellite component dataset is proposed, consisting of 98 video sequences of 13 satellites (32 402 frames in total) with complex backgrounds, varied on-orbit illumination, and common satellite motions. The article also proposes an attention-based real-time network, the Pyramid Attention and Decoupled Attention Network (PADAN), which comes in an image-based version, PADAN-S, and a video-based version, PADAN-T. PADAN-S, which is based on AttaNet, applies pyramid attention to three-layer pyramid features and then performs decoupled attention that considers both row and column attention. PADAN-T uses part of PADAN-S to obtain temporal pyramid features from temporal frames, then performs decoupled attention between the features of the output frame and the features at each layer of the temporal pyramid. Experimental results show that PADAN-S and PADAN-T are more accurate than other state-of-the-art real-time algorithms on both image-based and video-based satellite component semantic segmentation tasks on simulated datasets, and that the dataset simulates the real on-orbit environment to a certain degree. PADAN-S achieves 10.25 frames per second at an image resolution of 1280 x 720 pixels on the Jetson Xavier edge computing device, and PADAN-T achieves 7.18 frames per second.
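To put the reported figures in more familiar units, the short sketch below converts the two frame rates into an average per-frame time and computes the mean sequence length of the dataset. It uses only the numbers quoted in the abstract. The helper name `latency_ms` is hypothetical, and the conversion assumes frames are processed one at a time, so that per-frame time is simply the inverse of throughput; the abstract does not say how the frame rates were measured.

```python
# Figures quoted in the abstract.
FPS_PADAN_S = 10.25    # frames per second at 1280 x 720 on Jetson Xavier
FPS_PADAN_T = 7.18     # frames per second for the video-based version
NUM_SEQUENCES = 98     # video sequences in the simulated dataset
NUM_FRAMES = 32402     # total frames in the dataset


def latency_ms(fps: float) -> float:
    """Average time per frame in milliseconds.

    Assumption: one frame is processed at a time (no batching or
    pipelining), so latency is the inverse of throughput.
    """
    return 1000.0 / fps


latency_s = latency_ms(FPS_PADAN_S)
latency_t = latency_ms(FPS_PADAN_T)
mean_frames_per_sequence = NUM_FRAMES / NUM_SEQUENCES
pixels_per_second_s = 1280 * 720 * FPS_PADAN_S

print(f"PADAN-S: {latency_s:.2f} ms per frame")
print(f"PADAN-T: {latency_t:.2f} ms per frame")
print(f"Mean sequence length: {mean_frames_per_sequence:.1f} frames")
print(f"PADAN-S pixel throughput: {pixels_per_second_s:,.0f} pixels/s")
```

Under that assumption, PADAN-S takes a little under 100 ms per frame and PADAN-T roughly 140 ms, and each sequence averages about 330 frames.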


Pages: 7315 - 7333

Page count: 19
