Segmenting Objects From Relational Visual Data

Cited by: 112
Authors
Lu, Xiankai [1 ]
Wang, Wenguan [2 ]
Shen, Jianbing [3 ]
Crandall, David J. [4 ]
Van Gool, Luc [2 ]
Affiliations
[1] Shandong Univ, Sch Software, Jinan 250100, Shandong, Peoples R China
[2] Swiss Fed Inst Technol, CH-8092 Zurich, Switzerland
[3] Univ Macau, Dept Comp & Informat Sci, State Key Lab Internet Things Smart City, Macau, Peoples R China
[4] Indiana Univ, Luddy Sch Informat Comp & Engn, Bloomington, IN 47405 USA
Funding
National Natural Science Foundation of China;
Keywords
Image segmentation; Visualization; Integrated circuits; Task analysis; Frequency selective surfaces; Semantics; Message passing; Graph neural network; automatic video segmentation; image co-segmentation; few-shot semantic segmentation; SEGMENTATION; NETWORK;
DOI
10.1109/TPAMI.2021.3115815
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this article, we model a set of pixel-wise object segmentation tasks - automatic video segmentation (AVS), image co-segmentation (ICS) and few-shot semantic segmentation (FSS) - in a unified view of segmenting objects from relational visual data. To this end, we propose an attentive graph neural network (AGNN) that addresses these tasks in a holistic fashion, by formulating them as a process of iterative information fusion over data graphs. It builds a fully-connected graph that efficiently represents visual data as nodes and relations between data instances as edges. The underlying relations are described by a differentiable attention mechanism, which thoroughly examines fine-grained semantic similarities between all possible location pairs in two data instances. Through parametric message passing, AGNN is able to capture knowledge from the relational visual data, enabling more accurate object discovery and segmentation. Experiments show that AGNN can automatically highlight primary foreground objects in video sequences (i.e., automatic video segmentation) and extract common objects from noisy collections of semantically related images (i.e., image co-segmentation). AGNN can even generalize to segment new categories with little annotated data (i.e., few-shot semantic segmentation). Taken together, our results demonstrate that AGNN provides a powerful tool that is applicable to a wide range of pixel-wise object pattern understanding tasks with relational visual data. Our algorithm implementations have been made publicly available at https://github.com/carrierlxk/AGNN.
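The core operation the abstract describes - attention over all location pairs between two data instances, followed by message passing on a fully-connected graph - can be illustrated with a minimal NumPy sketch. This is a simplified, hypothetical reconstruction for intuition only, not the authors' implementation: the feature shapes, the bilinear affinity matrix `W`, and the residual update are assumptions (the paper aggregates messages with a learned recurrent update rather than a plain residual).

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attentive_message(h_i, h_j, W):
    """Message passed from node j to node i.

    h_i: (Ni, C) per-location features of node i (e.g., one video frame)
    h_j: (Nj, C) per-location features of node j
    W:   (C, C)  learnable bilinear affinity (assumed form)

    The attention matrix scores every location pair (p in i, q in j),
    then each location in i aggregates node j's features by attention.
    """
    affinity = h_i @ W @ h_j.T          # (Ni, Nj) pairwise similarities
    attn = softmax(affinity, axis=1)    # attend over node j's locations
    return attn @ h_j                   # (Ni, C) aggregated message

rng = np.random.default_rng(0)
C = 8                                   # toy channel dimension
h1 = rng.standard_normal((16, C))       # 16 spatial locations per node
h2 = rng.standard_normal((16, C))
W = rng.standard_normal((C, C)) * 0.1

# One round of message passing on a fully-connected two-node graph:
m1 = attentive_message(h1, h2, W)
m2 = attentive_message(h2, h1, W)
h1 = h1 + m1                            # simple residual update (a stand-in
h2 = h2 + m2                            # for the paper's learned update)
```

In the full model this exchange runs for several iterations over all node pairs, so each instance's representation is progressively refined by evidence from the whole graph before a read-out head predicts the segmentation mask.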
Pages: 7885 - 7897
Page count: 13