Progressive Frame-Proposal Mining for Weakly Supervised Video Object Detection

Cited by: 4
Authors
Han, Mingfei [1]
Wang, Yali [2, 3]
Li, Mingjie [4]
Chang, Xiaojun [1]
Yang, Yi [5]
Qiao, Yu [2, 3]
Affiliations
[1] Univ Technol Sydney, Australian Artificial Intelligence Inst, Fac Engn & Informat Technol, ReLER Lab, Ultimo, NSW 2007, Australia
[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China
[3] Shanghai Artificial Intelligence Lab, Shanghai 202150, Peoples R China
[4] Stanford Univ, Dept Radiat Oncol, Stanford, CA 94305 USA
[5] Zhejiang Univ, Sch Comp Sci, Hangzhou 310000, Peoples R China
Funding
Australian Research Council
Keywords
Proposals; Object detection; Detectors; Annotations; Task analysis; Training; Benchmark testing; Video object detection; weakly supervised learning; holistic-view refinement;
DOI
10.1109/TIP.2024.3364536
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405
Abstract
In this paper, we focus on the weakly supervised video object detection problem, where each training video is tagged only with object labels, without any bounding-box annotations. To train object detectors effectively from such weakly annotated videos, we propose a Progressive Frame-Proposal Mining (PFPM) framework that exploits discriminative proposals in a coarse-to-fine manner. First, we design a flexible Multi-Level Selection (MLS) scheme with explicit guidance from video tags. By selecting object-relevant frames and mining important proposals from these frames, MLS reduces frame redundancy and improves proposal effectiveness, which boosts weakly supervised detectors. Moreover, we develop a novel Holistic-View Refinement (HVR) scheme, which globally evaluates the importance of proposals across frames and thus correctly refines pseudo ground-truth boxes for training video detectors in a self-supervised manner. Finally, we evaluate PFPM on ImageNet VID, a large-scale benchmark for video object detection, under the weak-annotation setting. The experimental results demonstrate that PFPM significantly outperforms state-of-the-art weakly supervised detectors.
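The coarse-to-fine idea in the abstract (pick object-relevant frames, mine proposals inside them, then compare the mined proposals across frames) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the top-k selection rule, and the score arrays below are all assumptions made for illustration, and the record does not say how PFPM actually scores frames or proposals.

```python
import numpy as np


def select_frames(frame_scores, k):
    """Frame-level selection (hypothetical rule): keep the k frames that
    score highest for the video tag, returned in temporal order."""
    order = np.argsort(-frame_scores, kind="stable")
    return np.sort(order[:k])


def mine_proposals(proposal_scores, frames, m):
    """Proposal-level selection (hypothetical rule): keep the top-m
    proposals inside each selected frame."""
    return {
        int(f): np.argsort(-proposal_scores[f], kind="stable")[:m]
        for f in frames
    }


def holistic_pick(proposal_scores, mined):
    """Cross-frame comparison (hypothetical rule): rank all mined proposals
    jointly and return the best (frame, proposal, score) as a pseudo box."""
    candidates = [
        (float(proposal_scores[f, p]), f, int(p))
        for f, props in mined.items()
        for p in props
    ]
    score, frame, prop = max(candidates)
    return frame, prop, score


# Made-up scores: 4 frames, 3 proposals per frame.
frame_scores = np.array([0.1, 0.9, 0.4, 0.8])
proposal_scores = np.array([
    [0.90, 0.10, 0.20],  # frame 0: high proposal score, but frame is irrelevant
    [0.30, 0.70, 0.50],
    [0.20, 0.20, 0.60],
    [0.40, 0.85, 0.10],
])

frames = select_frames(frame_scores, k=2)
mined = mine_proposals(proposal_scores, frames, m=2)
best = holistic_pick(proposal_scores, mined)
print(frames, mined, best)
```

In this toy data the highest-scoring proposal overall sits in frame 0, but it is never considered because frame 0 is filtered out at the frame level, which is the kind of redundancy reduction the abstract attributes to multi-level selection.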
Pages: 1560 - 1573
Page count: 14
Related papers
50 in total
  • [21] A progressive segmentation with weight contrast label enhancement for weakly supervised video salient object detection
    Lu, Zelin
    Liang, Haoran
    Xu, Binwei
    Liang, Ronghua
    IET IMAGE PROCESSING, 2023, 17 (10) : 2925 - 2936
  • [22] Mining High-Quality Pseudoinstance Soft Labels for Weakly Supervised Object Detection in Remote Sensing Images
    Qian, Xiaoliang
    Huo, Yu
    Cheng, Gong
    Gao, Chenyang
    Yao, Xiwen
    Wang, Wei
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [23] MPLA-Net: Multiple Pseudo Label Aggregation Network for Weakly Supervised Video Salient Object Detection
    Ma, Chunjie
    Du, Lina
    Zhuo, Li
    Li, Jiafeng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (05) : 3905 - 3918
  • [24] SAENet: Self-Supervised Adversarial and Equivariant Network for Weakly Supervised Object Detection in Remote Sensing Images
    Feng, Xiaoxu
    Yao, Xiwen
    Cheng, Gong
    Han, Jungong
    Han, Junwei
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [25] Joint Multisource Saliency and Exemplar Mechanism for Weakly Supervised Video Object Segmentation
    En, Qing
    Duan, Lijuan
    Zhang, Zhaoxiang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 (30) : 8155 - 8169
  • [26] Explicit and Implicit Box Equivariance Learning for Weakly-Supervised Rotated Object Detection
    Wang, Linfei
    Zhan, Yibing
    Lin, Xu
    Yu, Baosheng
    Ding, Liang
    Zhu, Jianqing
    Tao, Dapeng
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2025, 9 (01) : 509 - 521
  • [27] Weakly-Supervised Saliency Detection via Salient Object Subitizing
    Zheng, Xiaoyang
    Tan, Xin
    Zhou, Jie
    Ma, Lizhuang
    Lau, Rynson W. H.
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (11) : 4370 - 4380
  • [28] Proposal-Refined Weakly Supervised Object Detection in Underwater Images
    Lv, Xiaoqian
    Wang, An
    Liu, Qinglin
    Sun, Jiamin
    Zhang, Shengping
    IMAGE AND GRAPHICS, ICIG 2019, PT I, 2019, 11901 : 418 - 428
  • [29] WINDOW MINING BY CLUSTERING MID-LEVEL REPRESENTATION FOR WEAKLY SUPERVISED OBJECT DETECTION
    Wang, Chong
    Ren, Weiqiang
    Huang, Kaiqi
    2014 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2014 : 4067 - 4071
  • [30] Weakly-Supervised Video Object Grounding via Learning Uni-Modal Associations
    Wang, Wei
    Gao, Junyu
    Xu, Changsheng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 6329 - 6340