Text proposals with location-awareness-attention network for arbitrarily shaped scene text detection and recognition

Cited by: 12
Authors
Zhong, Dajian [1]
Lyu, Shujing [1, 2]
Shivakumara, Palaiahankote [3]
Pal, Umapada [4]
Lu, Yue [1, 2]
Affiliations
[1] East China Normal Univ, Shanghai Key Lab Multidimens Informat Proc, Shanghai 200241, Peoples R China
[2] East China Normal Univ, Sch Commun & Elect Engn, Shanghai 200241, Peoples R China
[3] Univ Malaya, Fac Comp Sci & Informat Technol FSKTM, Kuala Lumpur 50603, Malaysia
[4] Indian Stat Inst, CVPR Unit, Kolkata 700108, India
Funding
National Natural Science Foundation of China
Keywords
Scene text detection; Scene text recognition; Text proposal; Attention model; Location-awareness-attention model; NEURAL-NETWORK; IMAGE;
DOI
10.1016/j.eswa.2022.117564
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Unlike existing models that address scene text detection and recognition separately, the proposed work handles both tasks in a single architecture that copes with arbitrarily oriented/shaped text. To this end, a novel Text Proposal with Location-Awareness-Attention Network (TPLAANet) for arbitrarily oriented/shaped text detection and recognition is proposed. For text detection, the method uses central mask prediction to locate text instances, a bounding box regression branch to produce tight bounding boxes, and a mask branch to obtain accurate positions of arbitrarily oriented/shaped text instances. For text recognition, the method exploits character information using a Location-Awareness-Attention Network (LAAN), which learns a two-dimensional attention weight and improves recognition performance. The model is evaluated on the commonly used horizontal and multi-oriented natural scene text datasets ICDAR2013 and ICDAR2015, and on the arbitrarily shaped scene text datasets Total-Text and CTW1500. Experimental results are provided to validate the effectiveness of the proposed method. The code is available at: https://codeocean.com/capsule/5666319/tree/v1.
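The abstract mentions that the recognition branch learns a two-dimensional attention weight. The sketch below illustrates only the general idea of 2D soft attention: a weight is assigned to every position of an H x W feature map and the weights sum to 1. It is not the paper's LAAN. The function name `attention_2d`, the dot-product scoring, and the toy feature values are all assumptions made for illustration.

```python
import math


def attention_2d(features, query):
    """Generic 2D soft attention (illustrative; not the paper's LAAN).

    features: H x W x C nested lists of floats (a toy feature map).
    query:    length-C list of floats (e.g. a decoder state; assumed here).
    Returns (weights, glimpse): an H x W weight map that sums to 1 and
    the length-C weighted sum of the feature vectors.
    """
    # Dot-product score for every spatial position (assumed scoring rule).
    scores = [
        [sum(f * q for f, q in zip(cell, query)) for cell in row]
        for row in features
    ]
    # Softmax over all H*W positions jointly, shifted by the max for stability.
    peak = max(max(row) for row in scores)
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    weights = [[e / total for e in row] for row in exps]

    # Glimpse: attention-weighted sum of the feature vectors.
    channels = len(query)
    glimpse = [0.0] * channels
    for weight_row, feature_row in zip(weights, features):
        for w, cell in zip(weight_row, feature_row):
            for k in range(channels):
                glimpse[k] += w * cell[k]
    return weights, glimpse


# Toy 2 x 3 feature map with 2 channels (hypothetical values).
features = [
    [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    [[2.0, 0.0], [0.0, 0.0], [1.0, 1.0]],
]
query = [1.0, 0.0]
weights, glimpse = attention_2d(features, query)
```

Because the softmax runs over both spatial axes at once, the weight map can concentrate on a single character's location anywhere in the region, which is the property that makes 2D attention attractive for curved or rotated text compared with attention over a one-dimensional sequence.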
Pages: 15
Related papers
50 in total
  • [31] SLOAN: Scale-Adaptive Orientation Attention Network for Scene Text Recognition
    Dai, Pengwen
    Zhang, Hua
    Cao, Xiaochun
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 1687 - 1701
  • [32] MORAN: A Multi-Object Rectified Attention Network for scene text recognition
    Luo, Canjie
    Jin, Lianwen
    Sun, Zenghui
    PATTERN RECOGNITION, 2019, 90 : 109 - 118
  • [33] STAN: A sequential transformation attention-based network for scene text recognition
    Lin, Qingxiang
    Luo, Canjie
    Jin, Lianwen
    Lai, Songxuan
    PATTERN RECOGNITION, 2021, 111
  • [34] Supervised Attention Network for Arbitrary-Shaped Text Detection in Edge-Fainted Noisy Scene Images
    Soni, Aishwarya
    Dutta, Tanima
    Nigam, Nitika
    Verma, Deepali
    Gupta, Hari Prabhat
    IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2023, 10 (03) : 1179 - 1188
  • [35] FACLSTM: ConvLSTM with focused attention for scene text recognition
    Wang, Qingqing
    Huang, Ye
    Jia, Wenjing
    He, Xiangjian
    Blumenstein, Michael
    Lyu, Shujing
    Lu, Yue
    SCIENCE CHINA-INFORMATION SCIENCES, 2020, 63 (02)
  • [36] Attention Guided Feature Encoding for Scene Text Recognition
    Hassan, Ehtesham
    Lekshmi, V. L.
    JOURNAL OF IMAGING, 2022, 8 (10)
  • [37] FACLSTM: ConvLSTM with focused attention for scene text recognition
    Qingqing Wang
    Ye Huang
    Wenjing Jia
    Xiangjian He
    Michael Blumenstein
    Shujing Lyu
    Yue Lu
    Science China Information Sciences, 2020, 63
  • [38] FACLSTM: ConvLSTM with focused attention for scene text recognition
    Qingqing WANG
    Ye HUANG
    Wenjing JIA
    Xiangjian HE
    Michael BLUMENSTEIN
    Shujing LYU
    Yue LU
    Science China (Information Sciences), 2020, 63 (02) : 35 - 48
  • [39] Sequential alignment attention model for scene text recognition
    Wu, Yan
    Fan, Jiaxin
    Tao, Renshuai
    Wang, Jiakai
    Qin, Haotong
    Liu, Aishan
    Liu, Xianglong
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2021, 80
  • [40] Detection and rectification of arbitrary shaped scene texts by using text keypoints and links
    Xue, Chuhui
    Lu, Shijian
    Hoi, Steven
    PATTERN RECOGNITION, 2022, 124