Progressive Frame-Proposal Mining for Weakly Supervised Video Object Detection

Cited by: 4
Authors
Han, Mingfei [1]
Wang, Yali [2, 3]
Li, Mingjie [4]
Chang, Xiaojun [1]
Yang, Yi [5]
Qiao, Yu [2, 3]
Affiliations
[1] Univ Technol Sydney, Australian Artificial Intelligence Inst, Fac Engn & Informat Technol, ReLER Lab, Ultimo, NSW 2007, Australia
[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China
[3] Shanghai Artificial Intelligence Lab, Shanghai 202150, Peoples R China
[4] Stanford Univ, Dept Radiat Oncol, Stanford, CA 94305 USA
[5] Zhejiang Univ, Sch Comp Sci, Hangzhou 310000, Peoples R China
Funding
Australian Research Council;
Keywords
Proposals; Object detection; Detectors; Annotations; Task analysis; Training; Benchmark testing; Video object detection; weakly supervised learning; holistic-view refinement;
DOI
10.1109/TIP.2024.3364536
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we focus on the weakly supervised video object detection problem, where each training video is tagged only with object labels, without any bounding-box annotations. To effectively train object detectors from such weakly annotated videos, we propose a Progressive Frame-Proposal Mining (PFPM) framework that exploits discriminative proposals in a coarse-to-fine manner. First, we design a flexible Multi-Level Selection (MLS) scheme with explicit guidance from video tags. By selecting object-relevant frames and mining important proposals from these frames, MLS reduces frame redundancy and improves proposal effectiveness, which boosts weakly supervised detectors. Moreover, we develop a novel Holistic-View Refinement (HVR) scheme, which globally evaluates the importance of proposals across frames and thus correctly refines pseudo ground-truth boxes for training video detectors in a self-supervised manner. Finally, we evaluate the proposed PFPM on ImageNet VID, a large-scale benchmark for video object detection, under the weak-annotation setting. The experimental results demonstrate that PFPM significantly outperforms state-of-the-art weakly supervised detectors.
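The coarse-to-fine idea in the abstract (pick object-relevant frames, mine proposals within them, then rank proposals jointly across frames) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify how frames or proposals are scored, so the per-proposal relevance scores, the function names (`select_frames`, `mine_proposals`, `global_top`), and the top-k selection rules below are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical input: scores[frame_index] holds one relevance score per
# proposal for the video-level tag. Every frame is assumed to have at
# least one proposal.
Scores = Dict[int, List[float]]
Mined = List[Tuple[int, int, float]]  # (frame index, proposal index, score)


def select_frames(scores: Scores, num_frames: int) -> List[int]:
    """Coarse step (assumed rule): keep frames whose best proposal scores highest."""
    ranked = sorted(scores, key=lambda f: max(scores[f]), reverse=True)
    return sorted(ranked[:num_frames])


def mine_proposals(scores: Scores, frames: List[int], per_frame: int) -> Mined:
    """Fine step (assumed rule): keep the top-scoring proposals in each selected frame."""
    mined: Mined = []
    for f in frames:
        order = sorted(range(len(scores[f])), key=lambda i: scores[f][i], reverse=True)
        mined.extend((f, i, scores[f][i]) for i in order[:per_frame])
    return mined


def global_top(mined: Mined, keep: int) -> Mined:
    """Cross-frame step (assumed rule): rank mined proposals jointly over all frames."""
    return sorted(mined, key=lambda t: t[2], reverse=True)[:keep]


# Toy scores for a four-frame video.
scores = {0: [0.1, 0.2], 1: [0.9, 0.3, 0.8], 2: [0.5, 0.7], 3: [0.05]}
frames = select_frames(scores, num_frames=2)
mined = mine_proposals(scores, frames, per_frame=2)
pseudo_gt = global_top(mined, keep=3)
```

In this toy example, frames 1 and 2 are kept, and the three highest-scoring proposals across both frames serve as stand-ins for pseudo ground-truth candidates. A real system would derive the scores from a detector and refine boxes rather than only rank them.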
Pages: 1560-1573
Page count: 14
Related papers
50 in total
  • [41] Weakly-Supervised Video Anomaly Detection With Snippet Anomalous Attention
    Fan, Yidan
    Yu, Yongxin
    Lu, Wenhuan
    Han, Yahong
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (07) : 5480 - 5492
  • [42] Collaborative Normality Learning Framework for Weakly Supervised Video Anomaly Detection
    Liu, Yang
    Liu, Jing
    Zhao, Mengyang
    Li, Shuang
    Song, Liang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2022, 69 (05) : 2508 - 2512
  • [43] A Visual Representation-Guided Framework With Global Affinity for Weakly Supervised Salient Object Detection
    Xu, Binwei
    Liang, Haoran
    Gong, Weihua
    Liang, Ronghua
    Chen, Peng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (01) : 248 - 259
  • [44] From Discriminant to Complete: Reinforcement Searching-Agent Learning for Weakly Supervised Object Detection
    Zhang, Dingwen
    Han, Junwei
    Zhao, Long
    Zhao, Tao
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (12) : 5549 - 5560
  • [45] Weakly-supervised object detection via mining pseudo ground truth bounding-boxes
    Zhang, Yongqiang
    Bai, Yaicheng
    Ding, Mingli
    Li, Yongqiang
    Ghanem, Bernard
    PATTERN RECOGNITION, 2018, 84 : 68 - 81
  • [46] PistonNet: Object Separating From Background by Attention for Weakly Supervised Ship Detection
    Yang, Yi
    Pan, Zongxu
    Hu, Yuxin
    Ding, Chibiao
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2022, 15 : 5190 - 5202
  • [47] Weakly Supervised Monocular 3D Object Detection by Spatial-Temporal View Consistency
    Han, Wencheng
    Tao, Runzhou
    Ling, Haibin
    Shen, Jianbing
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (01) : 84 - 98
  • [48] Weakly-Supervised Salient Object Detection on Light Fields
    Liang, Zijian
    Wang, Pengjie
    Xu, Ke
    Zhang, Pingping
    Lau, Rynson W. H.
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 6295 - 6305
  • [49] Forget and Diversify: Regularized Refinement for Weakly Supervised Object Detection
    Son, Jeany
    Kim, Daniel
    Lee, Solae
    Kwak, Suha
    Cho, Minsu
    Han, Bohyung
    COMPUTER VISION - ACCV 2018, PT IV, 2019, 11364 : 632 - 648
  • [50] Weakly-supervised Human-object Interaction Detection
    Sugimoto, Masaki
    Furuta, Ryosuke
    Taniguchi, Yukinobu
    VISAPP: PROCEEDINGS OF THE 16TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS - VOL. 5: VISAPP, 2021, : 293 - 300