Progressive Frame-Proposal Mining for Weakly Supervised Video Object Detection

Cited by: 4

Authors:

Han, Mingfei [1]

Wang, Yali [2,3]

Li, Mingjie [4]

Chang, Xiaojun [1]

Yang, Yi [5]

Qiao, Yu [2,3]

Affiliations:

[1] Univ Technol Sydney, Australian Artificial Intelligence Inst, Fac Engn & Informat Technol, ReLER Lab, Ultimo, NSW 2007, Australia

[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China

[3] Shanghai Artificial Intelligence Lab, Shanghai 202150, Peoples R China

[4] Stanford Univ, Dept Radiat Oncol, Stanford, CA 94305 USA

[5] Zhejiang Univ, Sch Comp Sci, Hangzhou 310000, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2024 / Vol. 33

Funding:

Australian Research Council;

Keywords:

Proposals; Object detection; Detectors; Annotations; Task analysis; Training; Benchmark testing; Video object detection; weakly supervised learning; holistic-view refinement;

DOI:

10.1109/TIP.2024.3364536

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we focus on the weakly supervised video object detection problem, where each training video is tagged only with object labels, without any bounding box annotations. To effectively train object detectors from such weakly annotated videos, we propose a Progressive Frame-Proposal Mining (PFPM) framework that exploits discriminative proposals in a coarse-to-fine manner. First, we design a flexible Multi-Level Selection (MLS) scheme with explicit guidance from video tags. By selecting object-relevant frames and mining important proposals from these frames, MLS reduces frame redundancy and improves proposal effectiveness, boosting weakly supervised detectors. Moreover, we develop a novel Holistic-View Refinement (HVR) scheme, which globally evaluates the importance of proposals across frames and thus correctly refines pseudo ground truth boxes for training video detectors in a self-supervised manner. Finally, we evaluate PFPM on ImageNet VID, a large-scale benchmark for video object detection, under the weak-annotation setting. The experimental results demonstrate that PFPM significantly outperforms state-of-the-art weakly supervised detectors.
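The coarse-to-fine idea in the abstract (pick object-relevant frames, mine proposals within them, then compare proposals across frames) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the top-k selection rule, and the use of plain scalar scores are all assumptions made for illustration, since the abstract does not specify how MLS or HVR score frames and proposals.

```python
from typing import Dict, List, Tuple

# Hypothetical inputs (assumptions, not from the paper):
#   frame_scores[i]       relevance of frame i to the video tag
#   proposal_scores[i][j] score of proposal j in frame i

def select_frames(frame_scores: List[float], k: int) -> List[int]:
    """Coarse step: keep the k frames with the highest tag relevance."""
    order = sorted(range(len(frame_scores)),
                   key=lambda i: frame_scores[i], reverse=True)
    return sorted(order[:k])  # return indices in temporal order

def mine_proposals(proposal_scores: List[List[float]],
                   frames: List[int], m: int) -> Dict[int, List[int]]:
    """Fine step: keep the m highest-scoring proposals in each kept frame."""
    mined = {}
    for f in frames:
        scores = proposal_scores[f]
        order = sorted(range(len(scores)),
                       key=lambda j: scores[j], reverse=True)
        mined[f] = order[:m]
    return mined

def best_across_frames(proposal_scores: List[List[float]],
                       mined: Dict[int, List[int]]) -> Tuple[int, int]:
    """Global comparison: the single best (frame, proposal) pair over all
    kept frames, used here as a stand-in for a pseudo ground truth box."""
    pairs = [(f, j) for f, js in mined.items() for j in js]
    return max(pairs, key=lambda fj: proposal_scores[fj[0]][fj[1]])

frame_scores = [0.1, 0.9, 0.4, 0.8]
proposal_scores = [[0.2], [0.3, 0.7, 0.5], [0.1], [0.6, 0.95]]

frames = select_frames(frame_scores, k=2)
mined = mine_proposals(proposal_scores, frames, m=2)
pseudo_gt = best_across_frames(proposal_scores, mined)
```

In this toy run, frames 1 and 3 survive the coarse step, and the cross-frame comparison picks proposal 1 of frame 3. A real system would score frames and proposals with learned detectors rather than fixed numbers.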


Pages: 1560-1573

Page count: 14

