Weakly Supervised Object Detection Using Proposal- and Semantic-Level Relationships

Cited by: 77

Authors:

Zhang, Dingwen [1]

Zeng, Wenyuan [1]

Yao, Jieru [1]

Han, Junwei [1]

Affiliations:

[1] Northwestern Polytech Univ, Sch Automat, Brain & Artificial Intelligence Lab, Xian 710072, Peoples R China

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2022 / Volume 44 / Issue 06

Funding:

National Science Foundation (USA);

Keywords:

Cognition; Proposals; Object detection; Supervised learning; Semantics; Task analysis; Network architecture; Weakly supervised object detection; multiple-instance learning; graphical convolutional network;

DOI:

10.1109/TPAMI.2020.3046647

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In recent years, weakly supervised object detection has attracted great attention in the computer vision community. Although numerous deep learning-based approaches have been proposed in the past few years, this ill-posed problem remains challenging and learning performance still falls short of expectations. In fact, most existing approaches consider only the visual appearance of each proposal region and neglect helpful context information. To this end, this paper introduces two levels of context into the weakly supervised learning framework. The first is the proposal-level context, i.e., the relationship between spatially adjacent proposals. The second is the semantic-level context, i.e., the relationship between co-occurring object categories. The proposed weakly supervised learning framework therefore contains not only the cognition process on visual appearance but also the reasoning process on the proposal- and semantic-level relationships, which leads to a novel deep multiple instance reasoning framework. Specifically, built upon a conventional CNN-based network architecture, the proposed framework is equipped with two additional graph convolutional network-based reasoning models that implement object location reasoning and multi-label reasoning within an end-to-end network training procedure. Comprehensive experiments on the widely used PASCAL VOC and MS COCO benchmarks demonstrate the superior capacity of the proposed approach compared with other state-of-the-art methods and baseline models.

Pages: 3349-3363

Number of pages: 15
