Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer

Cited by: 30
Authors
Kittenplon, Yair [1]
Lavi, Inbal [1]
Fogel, Sharon [1]
Bar, Yarin [1]
Manmatha, R. [1]
Perona, Pietro [1]
Affiliations
[1] AWS AI Labs, Cambridge, England
Keywords
RECOGNITION;
DOI
10.1109/CVPR52688.2022.00456
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
End-to-end text spotting methods have recently gained attention in the literature due to the benefits of jointly optimizing the text detection and recognition components. Existing methods usually have a distinct separation between the detection and recognition branches, requiring exact annotations for the two tasks. We introduce TextTranSpotter (TTS), a transformer-based approach for text spotting and the first text spotting framework that can be trained in both fully- and weakly-supervised settings. By learning a single latent representation per word detection, and using a novel loss function based on the Hungarian loss, our method alleviates the need for expensive localization annotations. Trained with only text transcription annotations on real data, our weakly-supervised method achieves performance competitive with previous state-of-the-art fully-supervised methods. When trained in a fully-supervised manner, TextTranSpotter shows state-of-the-art results on multiple benchmarks.
Pages: 4594-4603
Number of pages: 10
Related papers
50 in total
  • [1] Weakly-Supervised Medical Image Segmentation Based on Multi-task Learning
    Xie, Xuanhua
    Fan, Huijie
    Yu, Zhencheng
    Bai, Haijun
    Tang, Yandong
    INTELLIGENT ROBOTICS AND APPLICATIONS (ICIRA 2022), PT II, 2022, 13456 : 395 - 404
  • [2] Multi-proposal collaboration and multi-task training for weakly-supervised video moment retrieval
    Zhang, Bolin
    Yang, Chao
    Jiang, Bin
    Komamizu, Takahiro
    Ide, Ichiro
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2025,
  • [3] Joint Multi-task Learning Improves Weakly-Supervised Biomarker Prediction in Computational Pathology
    El Nahhas, Omar S. M.
    Woelflein, Georg
    Ligero, Marta
    Lenz, Tim
    van Treeck, Marko
    Khader, Firas
    Truhn, Daniel
    Kather, Jakob Nikolas
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT IV, 2024, 15004 : 254 - 262
  • [4] Multi-Task Weakly-Supervised Attention Network for Dementia Status Estimation With Structural MRI
    Lian, Chunfeng
    Liu, Mingxia
    Wang, Li
    Shen, Dinggang
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (08) : 4056 - 4068
  • [5] CommuSpotter: Scene Text Spotting with Multi-Task Communication
    Zhao, Liang
    Wilsbacher, Greg
    Wang, Song
    APPLIED SCIENCES-BASEL, 2023, 13 (23):
  • [6] Improving Weakly Supervised Lesion Segmentation using Multi-Task Learning
    Chu, Tianshu
    Li, Xinmeng
    Vo, Huy V.
    Summers, Ronald M.
    Sizikova, Elena
    MEDICAL IMAGING WITH DEEP LEARNING, VOL 143, 2021, 143 : 60 - 73
  • [7] Weakly-Supervised Hierarchical Text Classification
    Meng, Yu
    Shen, Jiaming
    Zhang, Chao
    Han, Jiawei
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 6826 - 6833
  • [8] Weakly-Supervised Neural Text Classification
    Meng, Yu
    Shen, Jiaming
    Zhang, Chao
    Han, Jiawei
    CIKM'18: PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2018, : 983 - 992
  • [9] Weakly-Supervised Text Instance Segmentation
    Zu, Xinyan
    Yu, Haiyang
    Li, Bin
    Xue, Xiangyang
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 1915 - 1923
  • [10] Weakly-Supervised Alignment of Video With Text
    Bojanowski, P.
    Lajugie, R.
    Grave, E.
    Bach, F.
    Laptev, I.
    Ponce, J.
    Schmid, C.
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 4462 - 4470