Generalized Weakly Supervised Object Localization

Cited by: 19
Authors
Zhang, Dingwen [1, 2]
Guo, Guangyu [1]
Zeng, Wenyuan [1]
Li, Lei [3]
Han, Junwei [2]
Affiliations
[1] Northwestern Polytech Univ, Sch Automat, Brain & Artificial Intelligence Lab, Xian 710129, Peoples R China
[2] Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Hefei 230026, Peoples R China
[3] Sci & Technol Complex Syst Control & Intelligent, Beijing 100074, Peoples R China
Funding
National Science Foundation (USA);
Keywords
Location awareness; Semantics; Annotations; Task analysis; Feature extraction; Training; Manuals; Object localization; unseen object category; weakly supervised learning;
DOI
10.1109/TNNLS.2022.3204337
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Weakly supervised object localization (WSOL), which aims to learn to localize specific object semantics using low-cost image-level annotations, has received increasing attention in recent years. Although the existing literature has studied a number of major issues in this field, one important yet challenging scenario remains unexplored: the test object semantics may have appeared in the training phase (seen categories) or may never have been observed before (unseen categories). We define this scenario as generalized WSOL (GWSOL) and make a pioneering effort to study it in this article. By leveraging attribute vectors to associate seen and unseen categories, we integrate three modeling components, i.e., class-sensitive modeling, semantic-agnostic modeling, and content-aware modeling, into a unified end-to-end learning framework. This design enables our model to recognize and localize unconstrained object semantics, learn compact and discriminative features that can represent potential unseen categories, and customize content-aware attribute weights to avoid localizing on misleading attribute elements. To advance this research direction, we contribute manual bounding-box annotations to the widely used AwA2 dataset and benchmark GWSOL methods on it. Comprehensive experiments demonstrate the effectiveness of the proposed learning framework and of each of its modeling components.
Pages: 5395-5406
Page count: 12