Re-Thinking Co-Salient Object Detection

Citations: 80
Authors
Fan, Deng-Ping [1]
Li, Tengpeng [2, 3]
Lin, Zheng [1]
Ji, Ge-Peng [4]
Zhang, Dingwen [5]
Cheng, Ming-Ming [1]
Fu, Huazhu [6]
Shen, Jianbing [7]
Affiliations
[1] Nankai Univ, Coll Comp Sci, Tianjin 300071, Peoples R China
[2] Nanjing Univ Informat Sci & Technol, B DAT, Tianjin 300071, Peoples R China
[3] Nanjing Univ Informat Sci & Technol, CICAEET, Tianjin 300071, Peoples R China
[4] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Hubei, Peoples R China
[5] Northwestern Polytech Univ, Brain & Artificial Intelligence Lab, Sch Automat, Xian 710072, Peoples R China
[6] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates
[7] Univ Macau, Dept Comp & Informat Sci, State Key Lab Internet Things Smart City, Macau, Peoples R China
Keywords
Benchmark testing; Object detection; Measurement; Semantics; Task analysis; Annotations; Optimization; Co-salient object detection; co-attention projection; CoSOD dataset; benchmark; DEEP; SEGMENTATION; DISCOVERY; FEATURES
DOI
10.1109/TPAMI.2021.3060412
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this article, we conduct a comprehensive study on the co-salient object detection (CoSOD) problem for images. CoSOD is an emerging and rapidly growing extension of salient object detection (SOD), which aims to detect the co-occurring salient objects in a group of images. However, existing CoSOD datasets often have a serious data bias, assuming that each group of images contains salient objects of similar visual appearances. Models trained on such datasets are tuned to these idealized settings, so their effectiveness can be impaired in real-life situations, where similarities are usually semantic or conceptual. To tackle this issue, we first introduce a new benchmark, called CoSOD3k in the wild, which requires a large amount of semantic context, making it more challenging than existing CoSOD datasets. Our CoSOD3k consists of 3,316 high-quality, elaborately selected images divided into 160 groups with hierarchical annotations. The images span a wide range of categories, shapes, object sizes, and backgrounds. Second, we integrate the existing SOD techniques to build a unified, trainable CoSOD framework, which is long overdue in this field. Specifically, we propose a novel CoEG-Net that augments our prior model EGNet with a co-attention projection strategy to enable fast common information learning. CoEG-Net fully leverages previous large-scale SOD datasets and significantly improves the model scalability and stability. Third, we comprehensively summarize 40 cutting-edge algorithms, benchmarking 18 of them over three challenging CoSOD datasets (iCoSeg, CoSal2015, and our CoSOD3k), and reporting more detailed (i.e., group-level) performance analysis. Finally, we discuss the challenges and future work of CoSOD. We hope that our study will give a strong boost to growth in the CoSOD community. The benchmark toolbox and results are available on our project page at https://dpfan.net/CoSOD3K.
Pages: 4339-4354
Page count: 16