Generalized Weakly Supervised Object Localization

Cited by: 19
Authors
Zhang, Dingwen [1, 2]
Guo, Guangyu [1]
Zeng, Wenyuan [1]
Li, Lei [3]
Han, Junwei [2]
Affiliations
[1] Northwestern Polytech Univ, Sch Automat, Brain & Artificial Intelligence Lab, Xian 710129, Peoples R China
[2] Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Hefei 230026, Peoples R China
[3] Sci & Technol Complex Syst Control & Intelligent, Beijing 100074, Peoples R China
Funding
National Science Foundation (USA);
Keywords
Location awareness; Semantics; Annotations; Task analysis; Feature extraction; Training; Manuals; Object localization; unseen object category; weakly supervised learning;
DOI
10.1109/TNNLS.2022.3204337
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Weakly supervised object localization (WSOL), which aims to learn to localize specific object semantics from low-cost image-level annotations, has received increasing attention in recent years. Although the existing literature has studied a number of major issues in this field, one important yet challenging scenario remains unexplored: the test object semantics may have appeared in the training phase (seen categories) or may never have been observed before (unseen categories). We define this scenario as generalized WSOL (GWSOL) and make a pioneering effort to study it in this article. By leveraging attribute vectors to associate seen and unseen categories, we integrate three modeling components, i.e., class-sensitive modeling, semantic-agnostic modeling, and content-aware modeling, into a unified end-to-end learning framework. This design enables our model to recognize and localize unconstrained object semantics, learn compact and discriminative features that can represent potential unseen categories, and customize content-aware attribute weights to avoid localizing on misleading attribute elements. To advance this research direction, we contribute manual bounding-box annotations to the widely used AwA2 dataset and benchmark the GWSOL methods. Comprehensive experiments demonstrate the effectiveness of the proposed learning framework and of each modeling component.
Pages: 5395-5406
Number of pages: 12