Self-supervised Implicit Glyph Attention for Text Recognition

Cited by: 13
Authors
Guan, Tongkun [1 ]
Gu, Chaochen [2 ]
Tu, Jingzheng [2 ]
Yang, Xue [1 ]
Feng, Qi [2 ]
Zhao, Yudi [2 ]
Shen, Wei [1 ]
Institutions
[1] Shanghai Jiao Tong Univ, AI Inst, MoE Key Lab Artificial Intelligence, Shanghai, Peoples R China
[2] Shanghai Jiao Tong Univ, Dept Automat, Shanghai, Peoples R China
Funding
Natural Science Foundation of Shanghai;
Keywords
NETWORK;
DOI
10.1109/CVPR52729.2023.01467
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The attention mechanism has become the de facto module in scene text recognition (STR) methods, due to its capability of extracting character-level representations. These methods can be categorized as implicit-attention-based or supervised-attention-based, depending on how the attention is computed: implicit attention and supervised attention are learned from sequence-level text annotations and character-level bounding box annotations, respectively. Implicit attention, as it may extract coarse or even incorrect spatial regions as character attention, is prone to an alignment-drift issue. Supervised attention can alleviate this issue, but it is character-category-specific, requires extra laborious character-level bounding box annotations, and would be memory-intensive when handling languages with large character sets. To address these issues, we propose a novel attention mechanism for STR, self-supervised implicit glyph attention (SIGA). SIGA delineates the glyph structures of text images by jointly self-supervised text segmentation and implicit attention alignment, which serve as the supervision to improve attention correctness without extra character-level annotations. Experimental results demonstrate that SIGA performs consistently and significantly better than previous attention-based STR methods, in terms of both attention correctness and final recognition performance, on publicly available context benchmarks and our contributed contextless benchmarks.
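The implicit attention the abstract contrasts with supervised attention can be sketched in a few lines: at each decoding step, a learned query scores every spatial position of the visual feature map, and the resulting attention weights are trained only indirectly, through sequence-level character supervision. The sketch below is a minimal toy illustration of that mechanism; all names, shapes, and the random parameters are assumptions for illustration, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative, not from the paper): visual features from a
# backbone network, flattened to T spatial positions of dimension D,
# decoded over a vocabulary of C character classes.
T, D, C = 16, 8, 5
feats = rng.standard_normal((T, D))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# One decoding step of implicit attention: a per-step query scores every
# position. The attention map itself receives no direct supervision, so it
# can drift onto coarse or wrong regions -- the issue SIGA targets by
# supervising it with self-supervised glyph structures instead.
query = rng.standard_normal(D)       # per-step character query (learned)
scores = feats @ query               # (T,) alignment scores
alpha = softmax(scores)              # (T,) attention map, sums to 1
glimpse = alpha @ feats              # (D,) attended character feature
W = rng.standard_normal((D, C))      # classifier weights (learned)
char_probs = softmax(glimpse @ W)    # (C,) character distribution
```

In training, only `char_probs` is compared against the ground-truth character sequence, which is why `alpha` can be inaccurate even when recognition loss is low.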
Pages: 15285 - 15294 (10 pages)