LARNet: Towards Lightweight, Accurate and Real-Time Salient Object Detection

Cited by: 6

Authors:

Wang, Zhenyu [1,2]

Zhang, Yunzhou [3]

Liu, Yan [3]

Qin, Cao [3]

Coleman, Sonya A. [4]

Kerr, Dermot [4]

Affiliations:

[1] Northeastern Univ, Fac Robot Sci & Engn, Shenyang 110819, Peoples R China

[2] Tech Univ Munich, D-80333 Munich, Germany

[3] Northeastern Univ, Coll Informat Sci & Engn, Shenyang 110819, Peoples R China

[4] Ulster Univ, Intelligent Syst Res Ctr, Londonderry BT48 7JL, Northern Ireland

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2024 / Vol. 26

Funding:

National Natural Science Foundation of China;

Keywords:

Object detection; Real-time systems; Neurons; Feature extraction; Visualization; Computational modeling; Performance evaluation; Context gating module; feature fusion; lightweight; saliency backbone network; salient object detection; VISUAL-ATTENTION; NETWORK; MODEL;

DOI:

10.1109/TMM.2023.3330082

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Salient object detection (SOD) has rapidly developed in recent years, and detection performance has greatly improved. However, the price of these improvements is increasingly complex networks that require more computing resources and sacrifice real-time performance. This makes it difficult to deploy these approaches on devices with limited computing resources (such as mobile phones, embedded platforms, etc.). Considering recently developed lightweight SOD models, their detection and real-time performance are always compromised in demanding practical application scenarios. To solve these problems, we propose a novel lightweight SOD method called LARNet and its corresponding extremely lightweight method LARNet* according to application requirements. These methods balance the relationship between lightweight requirements, detection accuracy and real-time performance. First, we propose a saliency backbone network tailored for SOD, which removes the need for pre-training with ImageNet and effectively reduces feature redundancy. Subsequently, we propose a novel context gating module (CGM), which simulates the physiological mechanism of human brain neurons and visual information processing, and realizes the deep fusion of multi-level features at the global level. Finally, the saliency map is output after fusion of multi-level features. Extensive experiments on popular benchmark datasets demonstrate that the proposed LARNet (LARNet*) achieves 98 (113) FPS on a GPU and 3 (6) FPS on a CPU. With approximately 680 K (90 K) parameters, the model has significant performance advantages over (extremely) lightweight methods, even surpassing some heavyweight models.

Pages: 5207-5222

Page count: 16

