Salient object detection in egocentric videos

Cited by: 1
Authors
Zhang, Hao [1 ]
Liang, Haoran [1 ]
Zhao, Xing [1 ]
Liu, Jian [1 ]
Liang, Ronghua [1 ]
Affiliations
[1] Zhejiang Univ Technol, Hangzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
image processing; object detection; segmentation; tracking
DOI
10.1049/ipr2.13080
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the realm of video salient object detection (VSOD), most research has traditionally centered on third-person perspective videos. However, this focus overlooks the distinct requirements of first-person tasks such as autonomous driving and robot vision. To bridge this gap, a novel dataset and a camera-movement-based VSOD model, CaMSD, designed specifically for egocentric videos, are introduced. First, the SalEgo dataset, comprising 17,400 fully annotated frames for video salient object detection, is presented. Second, a computational model incorporating a camera movement module is proposed, designed to emulate the viewing patterns humans exhibit when watching videos. Additionally, to segment a single salient object precisely during switches between salient objects, rather than segmenting two objects simultaneously, a saliency enhancement module based on the Squeeze-and-Excitation block is incorporated. Experimental results show that the approach outperforms other state-of-the-art methods on egocentric video salient object detection tasks. The dataset and code can be found at .
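The abstract builds its saliency enhancement module on the Squeeze-and-Excitation (SE) block. For orientation, below is a minimal, generic PyTorch sketch of a standard SE block (Hu et al., 2018); it illustrates only the channel-recalibration idea and is not the paper's actual module. The class name SEBlock, the channel count, and the reduction ratio of 16 are assumptions for the example.

import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Generic Squeeze-and-Excitation block (illustrative sketch, not CaMSD's module)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Squeeze: global average pooling collapses each feature map to one scalar per channel.
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Excitation: a two-layer bottleneck MLP produces per-channel weights in (0, 1).
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.pool(x).view(b, c)        # squeeze: (B, C)
        w = self.fc(w).view(b, c, 1, 1)    # excitation: per-channel attention weights
        return x * w                       # recalibrate the input features channel-wise


if __name__ == "__main__":
    feats = torch.randn(2, 64, 56, 56)     # dummy feature maps (assumed shape)
    print(SEBlock(64)(feats).shape)        # torch.Size([2, 64, 56, 56])

The appeal of this kind of block for emphasizing one salient object over another is that the learned channel weights can suppress feature channels responding to the non-target object while amplifying those of the target, at negligible computational cost.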
Pages: 2028-2037
Number of pages: 10