TIVE: A toolbox for identifying video instance segmentation errors

Cited by: 4
Authors
Jia, Wenhe [1]
Yang, Lu [1]
Jia, Zilong [1]
Zhao, Wenyi [1]
Zhou, Yilin [1]
Song, Qing [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Artificial Intelligence Acad, 10th Xitucheng Rd, Beijing 100876, Peoples R China
Keywords
Video instance segmentation; Error analyzing toolbox; Fine-grained metrics; OBJECT SEGMENTATION; NETWORK;
DOI
10.1016/j.neucom.2023.126321
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we introduce TIVE, a Toolbox for Identifying Video instance segmentation Errors. By operating directly on output prediction files, TIVE defines isolated error types and weights each type's damage to mAP, for the purpose of distinguishing model characteristics. By decomposing localization quality into spatial and temporal dimensions, a model's potential drawbacks in spatial segmentation and temporal association can be revealed. TIVE can also report mAP over instance temporal length for real applications. We conduct extensive experiments with the toolbox to further illustrate how spatial segmentation and temporal association affect each other. We expect that the analysis provided by TIVE can give researchers more insight, guiding the community toward more meaningful explorations of video instance segmentation. The proposed toolbox is available at https://github.com/wenhe-jia/TIVE. (c) 2023 Elsevier B.V. All rights reserved.
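The idea of decomposing localization quality into spatial and temporal parts can be illustrated with a minimal sketch. The code below is not taken from the TIVE repository. The `video_iou` function follows the usual spatio-temporal mask IoU used in video instance segmentation evaluation (intersections and unions summed over frames), while `temporal_iou` and `spatial_iou` are hypothetical helpers that show one possible way to separate the two dimensions. The `Track` representation (frame index mapped to a set of foreground pixels) is likewise an assumption made for illustration.

```python
from typing import Dict, Set, Tuple

Pixel = Tuple[int, int]
# Assumed representation: frame index -> set of foreground pixel coordinates.
Track = Dict[int, Set[Pixel]]


def video_iou(pred: Track, gt: Track) -> float:
    """Spatio-temporal IoU: sum intersections and unions over all frames."""
    frames = set(pred) | set(gt)
    inter = sum(len(pred.get(t, set()) & gt.get(t, set())) for t in frames)
    union = sum(len(pred.get(t, set()) | gt.get(t, set())) for t in frames)
    return inter / union if union else 0.0


def temporal_iou(pred: Track, gt: Track) -> float:
    """Hypothetical temporal term: IoU of the sets of frames with a non-empty mask."""
    pred_frames = {t for t, mask in pred.items() if mask}
    gt_frames = {t for t, mask in gt.items() if mask}
    union = pred_frames | gt_frames
    return len(pred_frames & gt_frames) / len(union) if union else 0.0


def spatial_iou(pred: Track, gt: Track) -> float:
    """Hypothetical spatial term: mean per-frame IoU over frames where both masks exist."""
    shared = [t for t in pred if pred[t] and gt.get(t)]
    if not shared:
        return 0.0
    per_frame = [len(pred[t] & gt[t]) / len(pred[t] | gt[t]) for t in shared]
    return sum(per_frame) / len(per_frame)


# Toy example: the ground truth spans 3 frames, the prediction covers only
# the first 2 frames and half of the object's pixels in each.
gt_track: Track = {t: {(0, 0), (0, 1)} for t in range(3)}
pred_track: Track = {t: {(0, 0)} for t in range(2)}

print(video_iou(pred_track, gt_track))
print(temporal_iou(pred_track, gt_track))
print(spatial_iou(pred_track, gt_track))
```

In this toy case the low overall video IoU comes from both a missed frame (temporal term) and under-segmentation in the frames that are covered (spatial term), which is the kind of distinction a spatial-temporal breakdown is meant to expose.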
Pages: 12
Related papers
79 entries in total
[1]   Diagnosing Error in Temporal Action Detectors [J].
Alwassel, Humam ;
Heilbron, Fabian Caba ;
Escorcia, Victor ;
Ghanem, Bernard .
COMPUTER VISION - ECCV 2018, PT III, 2018, 11207 :264-280
[2]   Classifying, Segmenting, and Tracking Object Instances in Video with Mask Propagation [J].
Bertasius, Gedas ;
Torresani, Lorenzo .
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020, :9736-9745
[3]   TIDE: A General Toolbox for Identifying Object Detection Errors [J].
Bolya, Daniel ;
Foley, Sean ;
Hays, James ;
Hoffman, Judy .
COMPUTER VISION - ECCV 2020, PT III, 2020, 12348 :558-573
[4]  
Borji A, 2019, Arxiv, DOI arXiv:1911.12451
[5]   Weakly supervised video object segmentation initialized with referring expression [J].
Bu, Xiaoqing ;
Sun, Yukuan ;
Wang, Jianming ;
Liu, Kunliang ;
Liang, Jiayu ;
Jin, Guanghao ;
Chung, Tae-Sun .
NEUROCOMPUTING, 2021, 453 :754-765
[6]  
Caelles S, 2019, Arxiv, DOI [arXiv:1905.00737, 10.48550/arXiv.1905.00737]
[7]  
Cao J., 2020, EUROPEAN C COMPUTER, P1
[8]   VisDrone-MOT2021: The Vision Meets Drone Multiple Object Tracking Challenge Results [J].
Chen, Guanlin ;
Wang, Wenguan ;
He, Zhijian ;
Wang, Lujia ;
Yuan, Yixuan ;
Zhang, Dingwen ;
Zhang, Jinglin ;
Zhu, Pengfei ;
Van Gool, Luc ;
Han, Junwei ;
Hoi, Steven ;
Hu, Qinghua ;
Liu, Ming ;
Sciarrone, Andrea ;
Sun, Chao ;
Garibotto, Chiara ;
Duong Nguyen-Ngoc Tran ;
Lavagetto, Fabio ;
Haleem, Halar ;
Motorcu, Hakki ;
Ates, Hasan F. ;
Huy-Hung Nguyen ;
Jeon, Hyung-Joon ;
Bisio, Igor ;
Jeon, Jae Wook ;
Li, Jiahao ;
Long Hoang Pham ;
Jeon, Moongu ;
Feng, Qianyu ;
Li, Shengwen ;
Tai Huu-Phuong Tran ;
Pan, Xiao ;
Song, Young-min ;
Yao, Yuehan ;
Du, Yunhao ;
Xu, Zhenyu ;
Luo, Zhipeng .
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, :2839-2846
[9]  
Chen S., 2021, ARXIV
[10]  
Cheng B., 2021, arXiv, DOI DOI 10.48550/ARXIV.2112.10764