Adversarial Transformers for Weakly Supervised Object Localization

Cited by: 5
Authors
Meng, Meng [1]
Zhang, Tianzhu [1]
Zhang, Zhe [2, 3]
Zhang, Yongdong [4]
Wu, Feng [4, 5]
Affiliations
[1] Univ Sci & Technol China, Sch Informat Sci & Technol, Dept Automat, Hefei, Peoples R China
[2] Univ Sci & Technol China, Sch Informat Sci & Technol, Dept Automat, Hefei, Peoples R China
[3] Lunar Explorat & Space Engn Ctr CNSA, Beijing, Peoples R China
[4] Univ Sci & Technol China, Sch Informat Sci & Technol, Dept Elect Engn & Informat Sci, Hefei, Peoples R China
[5] Univ Sci & Technol China, Sch Informat Sci & Technol, Dept Elect Engn & Informat Sci, Hefei, Peoples R China
Keywords
Perturbation methods; Adversarial training; Transformers; Weakly supervised object localization; Semantic segmentation
DOI
10.1109/TIP.2022.3220055
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Weakly supervised object localization (WSOL) aims at localizing objects with only image-level labels, which has better scalability and practicability than fully supervised methods. However, without pixel-level supervision, existing methods tend to generate rough localization maps, which hinders localization performance. To alleviate this problem, we propose an adversarial transformer network (ATNet), which aims to obtain a well-learned localization model with pixel-level pseudo labels. The proposed ATNet enjoys several merits. First, we design an object transformer ($G$) that can generate localization maps and pseudo labels effectively and dynamically, and a part transformer ($D$) to accurately discriminate detailed local differences between localization maps and pseudo labels. Second, we propose to train $G$ and $D$ via an adversarial process, where $G$ can generate more accurate localization maps approaching pseudo labels to fool $D$. To the best of our knowledge, this is the first work to explore transformers with adversarial training to obtain a well-learned localization model for WSOL. Extensive experiments with four backbones on two standard benchmarks demonstrate that our ATNet achieves favorable performance against state-of-the-art WSOL methods. Besides, our adversarial training can provide higher robustness against adversarial attacks.
Pages: 7130-7143
Number of pages: 14
Related papers
50 items in total
  • [21] Feature Fusion for Weakly Supervised Object Localization
    Tang, Xu
    Song, Yonghong
    Zhang, Yuanlin
    2018 CHINESE AUTOMATION CONGRESS (CAC), 2018, : 2548 - 2553
  • [22] Convolutional STN for Weakly Supervised Object Localization
    Meethal, Akhil
    Pedersoli, Marco
    Belharbi, Soufiane
    Granger, Eric
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 10157 - 10164
  • [23] Scaling Novel Object Detection with Weakly Supervised Detection Transformers
    LaBonte, Tyler
    Song, Yale
    Wang, Xin
    Vineet, Vibhav
    Joshi, Neel
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 85 - 96
  • [24] Weakly Supervised Object Discovery by Generative Adversarial & Ranking Networks
    Diba, Ali
    Sharma, Vivek
    Stiefelhagen, Rainer
    Van Gool, Luc
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2019), 2019, : 601 - 610
  • [25] Adaptive attention augmentor for weakly supervised object localization
    Zhang, Longhao
    Yang, Huihua
    NEUROCOMPUTING, 2021, 454 : 474 - 482
  • [26] Foreground Activation Maps for Weakly Supervised Object Localization
    Meng, Meng
    Zhang, Tianzhu
    Tian, Qi
    Zhang, Yongdong
    Wu, Feng
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 3365 - 3375
  • [27] Token Masking Transformer for Weakly Supervised Object Localization
    Xu, Wenhao
    Wang, Changwei
    Xu, Rongtao
    Xu, Shibiao
    Meng, Weiliang
    Zhang, Man
    Zhang, Xiaopeng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 2059 - 2069
  • [28] Weakly Supervised Object Localization with Latent Category Learning
    Wang, Chong
    Ren, Weiqiang
    Huang, Kaiqi
    Tan, Tieniu
    COMPUTER VISION - ECCV 2014, PT VI, 2014, 8694 : 431 - 445
  • [29] Rethinking erasing strategy on weakly supervised object localization
    Fan, Yuming
    Wei, Shikui
    Tan, Chuangchuang
    Chen, Xiaotong
    Yang, Dongming
    Zhao, Yao
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2025, 135
  • [30] Aggregation of attention and erasing for weakly supervised object localization
    Koo, Bongyeong
    Choi, Han-Soo
    Kang, Myungjoo
    IMAGE AND VISION COMPUTING, 2023, 129