LSNet: Lightweight Spatial Boosting Network for Detecting Salient Objects in RGB-Thermal Images

Cited by: 135
Authors
Zhou, Wujie [1, 2]
Zhu, Yun [1, 3]
Lei, Jingsheng [1, 4]
Yang, Rongwang
Yu, Lu [5]
Affiliations
[1] Zhejiang Univ Sci & Technol, Sch Informat & Elect Engn, Hangzhou 310023, Peoples R China
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
[3] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[4] Zhejiang Univ, Childrens Hosp, Sch Med, Hangzhou 310030, Peoples R China
[5] Zhejiang Univ, Inst Informat & Commun Engn, Hangzhou 310027, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Transfer learning; Boosting; Semantics; Prediction algorithms; Mobile handsets; Manifolds; Graphics processing units; Boundary boosting algorithm; transfer learning; RGB-thermal information; efficient salient object detection; MODEL; FUSION;
DOI
10.1109/TIP.2023.3242775
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Most recent methods for RGB (red-green-blue)-thermal salient object detection (SOD) require many floating-point operations and have numerous parameters, resulting in slow inference, especially on common processors, and impeding their deployment on mobile devices for practical applications. To address these problems, we propose a lightweight spatial boosting network (LSNet) for efficient RGB-thermal SOD that uses a lightweight MobileNetV2 backbone in place of a conventional backbone (e.g., VGG, ResNet). To improve feature extraction with the lightweight backbone, we propose a boundary boosting algorithm that optimizes the predicted saliency maps and reduces information collapse in low-dimensional features. The algorithm generates boundary maps from the predicted saliency maps without incurring additional calculations or complexity. As multimodality processing is essential for high-performance SOD, we adopt attentive feature distillation and selection and propose semantic and geometric transfer learning to enhance the backbone without increasing the complexity during testing. Experimental results demonstrate that the proposed LSNet achieves state-of-the-art performance compared with 14 RGB-thermal SOD methods on three datasets while reducing the number of floating-point operations (1.025G), the number of parameters (5.39M), and the model size (22.1 MB), and increasing the inference speed (9.95 fps for PyTorch, batch size of 1, and Intel i5-7500 processor; 93.53 fps for PyTorch, batch size of 1, and NVIDIA TITAN V graphics processor; 936.68 fps for PyTorch, batch size of 20, and graphics processor; 538.01 fps for TensorRT and batch size of 1; and 903.01 fps for TensorRT/FP16 and batch size of 1).
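The abstract does not state how the boundary maps are computed from the predicted saliency maps. Purely as an illustration, one way to derive a boundary map without any learnable parameters is a morphological gradient: the local maximum minus the local minimum of the saliency map in a small window. The sketch below is an assumption, not the authors' algorithm; the function name `boundary_from_saliency` and the window size `k` are hypothetical.

```python
def boundary_from_saliency(sal, k=3):
    """Illustrative boundary map: local max minus local min in a k x k window.

    `sal` is a 2D list of saliency values in [0, 1]. This is an assumed,
    parameter-free stand-in, not the method from the paper.
    """
    h, w = len(sal), len(sal[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the window around (y, x), clipped at the image border.
            window = [
                sal[j][i]
                for j in range(max(0, y - r), min(h, y + r + 1))
                for i in range(max(0, x - r), min(w, x + r + 1))
            ]
            # Flat regions give 0; pixels near a saliency edge give a high value.
            out[y][x] = max(window) - min(window)
    return out


# Toy saliency map: a 3 x 3 salient square centered in a 7 x 7 image.
sal = [[1.0 if 2 <= y <= 4 and 2 <= x <= 4 else 0.0 for x in range(7)]
       for y in range(7)]
boundary = boundary_from_saliency(sal)
```

In this toy case the response is zero in the flat background and at the center of the square, and nonzero in a band around the square's edge. In a PyTorch model the same idea is usually expressed with max pooling, which likewise adds no parameters.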
Pages: 1329-1340
Page count: 12