Text Prior Guided Scene Text Image Super-Resolution

Cited by: 31
Authors
Ma, Jianqi [1]
Guo, Shi [1]
Zhang, Lei [1]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
Keywords
Scene text image super-resolution; super-resolution; text prior; network; recognition
DOI
10.1109/TIP.2023.3237002
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Scene text image super-resolution (STISR) aims to improve the resolution and visual quality of low-resolution (LR) scene text images while simultaneously boosting the performance of text recognition. However, most existing STISR methods treat text images as natural scene images, ignoring the categorical information of text. In this paper, we make an attempt to embed a text recognition prior into the STISR model. Specifically, we adopt the predicted character recognition probability sequence as the text prior, which can be obtained conveniently from a text recognition model. The text prior provides categorical guidance for recovering high-resolution (HR) text images. In turn, the reconstructed HR image can refine the text prior. Based on this mutual refinement, we present a multi-stage text prior guided super-resolution (TPGSR) framework for STISR. Our experiments on the benchmark TextZoom dataset show that TPGSR not only effectively improves the visual quality of scene text images, but also significantly improves text recognition accuracy over existing STISR methods. Our model trained on TextZoom also demonstrates a certain generalization capability to the LR images in other datasets. The source code of our work is available
Pages: 1341-1353
Page count: 13
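
The multi-stage loop described in the abstract (recognizer output as a text prior, prior-guided super-resolution, then re-estimation of the prior from the reconstructed image) can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: `toy_recognizer` and `toy_sr` are hypothetical stand-ins for the learned recognition and super-resolution networks, and the 16x64 input size, sequence length of 8, 37 character classes, and three stages are all assumptions chosen for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def toy_recognizer(image, seq_len=8, feat_dim=4, num_classes=37, seed=0):
    """Hypothetical stand-in for a text recognizer.

    Returns a (seq_len, num_classes) character probability sequence,
    which plays the role of the text prior.
    """
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal((feat_dim, num_classes))
    feats = []
    # Split the image into seq_len vertical strips, then pool each strip
    # into feat_dim horizontal bands so the feature size is resolution-independent.
    for strip in np.array_split(image, seq_len, axis=1):
        bands = np.array_split(strip, feat_dim, axis=0)
        feats.append([band.mean() for band in bands])
    feats = np.asarray(feats)                      # (seq_len, feat_dim)
    return softmax(feats @ weights, axis=-1)

def toy_sr(lr, prior, scale=2):
    """Hypothetical stand-in for a prior-guided SR network.

    Upsamples by nearest neighbour and modulates each column by the
    recognizer's confidence at the corresponding sequence position.
    """
    up = np.repeat(np.repeat(lr, scale, axis=0), scale, axis=1)
    conf = prior.max(axis=1)                       # (seq_len,)
    idx = np.arange(up.shape[1]) * len(conf) // up.shape[1]
    gain = 1.0 + conf[idx]                         # one gain per output column
    return np.clip(up * gain[None, :], 0.0, 1.0)

def multi_stage_sr(lr, num_stages=3):
    # Stage 0: the prior is estimated from the LR image itself.
    prior = toy_recognizer(lr)
    sr = None
    for _ in range(num_stages):
        sr = toy_sr(lr, prior)                     # the prior guides reconstruction
        prior = toy_recognizer(sr)                 # the reconstruction refines the prior
    return sr, prior

lr = np.random.default_rng(1).random((16, 64))    # assumed LR size for illustration
sr, prior = multi_stage_sr(lr)
print(sr.shape, prior.shape)
```

In a real system both stand-ins would be trained networks; the sketch only shows how the prior and the reconstructed image feed each other across stages.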