LSNet: Lightweight Spatial Boosting Network for Detecting Salient Objects in RGB-Thermal Images

Cited by: 135

Authors:

Zhou, Wujie [1,2]

Zhu, Yun [1,3]

Lei, Jingsheng [1,4]

Yang, Rongwang

Yu, Lu [5]

Affiliations:

[1] Zhejiang Univ Sci & Technol, Sch Informat & Elect Engn, Hangzhou 310023, Peoples R China

[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore

[3] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China

[4] Zhejiang Univ, Childrens Hosp, Sch Med, Hangzhou 310030, Peoples R China

[5] Zhejiang Univ, Inst Informat & Commun Engn, Hangzhou 310027, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2023 / Vol. 32

Funding:

National Natural Science Foundation of China;

Keywords:

Transfer learning; Boosting; Semantics; Prediction algorithms; Mobile handsets; Manifolds; Graphics processing units; Boundary boosting algorithm; transfer learning; RGB-thermal information; efficient salient object detection; MODEL; FUSION;

DOI:

10.1109/TIP.2023.3242775

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Most recent methods for RGB (red-green-blue)-thermal salient object detection (SOD) involve several floating-point operations and have numerous parameters, resulting in slow inference, especially on common processors, and impeding their deployment on mobile devices for practical applications. To address these problems, we propose a lightweight spatial boosting network (LSNet) for efficient RGB-thermal SOD with a lightweight MobileNetV2 backbone to replace a conventional backbone (e.g., VGG, ResNet). To improve feature extraction using a lightweight backbone, we propose a boundary boosting algorithm that optimizes the predicted saliency maps and reduces information collapse in low-dimensional features. The algorithm generates boundary maps based on predicted saliency maps without incurring additional calculations or complexity. As multimodality processing is essential for high-performance SOD, we adopt attentive feature distillation and selection and propose semantic and geometric transfer learning to enhance the backbone without increasing the complexity during testing. Experimental results demonstrate that the proposed LSNet achieves state-of-the-art performance compared with 14 RGB-thermal SOD methods on three datasets while improving the numbers of floating-point operations (1.025G) and parameters (5.39M), model size (22.1 MB), and inference speed (9.95 fps for PyTorch, batch size of 1, and Intel i5-7500 processor; 93.53 fps for PyTorch, batch size of 1, and NVIDIA TITAN V graphics processor; 936.68 fps for PyTorch, batch size of 20, and graphics processor; 538.01 fps for TensorRT and batch size of 1; and 903.01 fps for TensorRT/FP16 and batch size of 1).
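The abstract states that boundary maps are derived from the predicted saliency maps rather than from a separate branch. The record does not give the procedure, so the following is only a minimal sketch of one common way to obtain a boundary from a saliency map: a morphological gradient (3x3 dilation minus 3x3 erosion) of the thresholded map. The function name `boundary_map`, the threshold of 0.5, and the 3x3 window are all assumptions made for illustration and are not taken from the paper.

```python
def boundary_map(saliency, threshold=0.5):
    """Illustrative only: boundary of a saliency map via a morphological gradient.

    `saliency` is a 2-D list of floats in [0, 1]. The threshold and the 3x3
    window are assumed values, not parameters reported in the paper.
    """
    h, w = len(saliency), len(saliency[0])
    # Binarize the predicted saliency map.
    mask = [[1 if v >= threshold else 0 for v in row] for row in saliency]
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the 3x3 neighbourhood, clipped at the image border.
            window = [
                mask[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            # Dilation (max) minus erosion (min) is 1 only where the
            # neighbourhood contains both object and background pixels.
            out[y][x] = max(window) - min(window)
    return out


# Toy 7x7 saliency map with a 3x3 salient square in the middle.
demo = [[1.0 if 2 <= y <= 4 and 2 <= x <= 4 else 0.0 for x in range(7)]
        for y in range(7)]
demo_boundary = boundary_map(demo)
```

In this toy case the interior pixel of the square and the far background come out as 0, while the band of pixels straddling the object edge comes out as 1. In a tensor framework the same max and min filters are typically expressed with pooling operations, which is one reason such a boundary can be produced without extra learnable parameters.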


Pages: 1329-1340

Page count: 12
