Instances as Queries

Cited by: 185
Authors
Fang, Yuxin [1 ]
Yang, Shusheng [1 ,2 ]
Wang, Xinggang [1 ]
Li, Yu [2 ]
Fang, Chen [3 ]
Shan, Ying [2 ]
Feng, Bin [1 ]
Liu, Wenyu [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch EIC, Wuhan, Peoples R China
[2] Tencent PCG, Appl Res Ctr ARC, Shenzhen, Peoples R China
[3] Tencent, Shenzhen, Peoples R China
Source
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021
DOI
10.1109/ICCV48922.2021.00683
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present QueryInst, a new perspective for instance segmentation. QueryInst is a multi-stage end-to-end system that treats instances of interest as learnable queries, enabling query-based object detectors, e.g., Sparse R-CNN, to have strong instance segmentation performance. The attributes of instances such as categories, bounding boxes, instance masks, and instance association embeddings are represented by queries in a unified manner. In QueryInst, a query is shared by both detection and segmentation via dynamic convolutions and driven by parallelly-supervised multi-stage learning. We conduct extensive experiments on three challenging benchmarks, i.e., COCO, CityScapes, and YouTube-VIS, to evaluate the effectiveness of QueryInst in object detection, instance segmentation, and video instance segmentation tasks. For the first time, we demonstrate that a simple end-to-end query-based framework can achieve state-of-the-art performance in various instance-level recognition tasks.
Pages: 6890-6899
Page count: 10
Related papers
63 in total
[1]  
Aaron van den Oord, Yazhe Li, 2018, PR MACH LEARN RES, P3918
[2]  
[Anonymous], 2016, INT CONF 3D VISION, DOI 10.1109/3DV.2016.79
[3]  
[Anonymous], 2016, Advances in Neural Information Processing Systems
[4]  
[Anonymous], 2018, ABS181010327 CORR
[5]   STEm-Seg: Spatio-Temporal Embeddings for Instance Segmentation in Videos [J].
Athar, Ali ;
Mahadevan, Sabarinath ;
Osep, Aljosa ;
Leal-Taixe, Laura ;
Leibe, Bastian .
COMPUTER VISION - ECCV 2020, PT XI, 2020, 12356 :158-177
[6]  
Bolya D., 2019, YOLACT BETTER REAL T
[7]  
Bolya Daniel, 2019, ICCV
[8]   Cascade R-CNN: High Quality Object Detection and Instance Segmentation [J].
Cai, Zhaowei ;
Vasconcelos, Nuno .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (05) :1483-1498
[9]   SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation [J].
Cao, Jiale ;
Anwer, Rao Muhammad ;
Cholakkal, Hisham ;
Khan, Fahad Shahbaz ;
Pang, Yanwei ;
Shao, Ling .
COMPUTER VISION - ECCV 2020, PT XIV, 2020, 12359 :1-18
[10]  
Carion Nicolas, 2020, EUROPEAN C COMPUTER