Text-Attentional Convolutional Neural Network for Scene Text Detection

Cited by: 233
Authors
He, Tong [1, 2]
Huang, Weilin [1, 3]
Qiao, Yu [1, 3]
Yao, Jian [2]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China
[2] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430072, Peoples R China
[3] Chinese Univ Hong Kong, Multimedia Lab, Hong Kong, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Maximally stable extremal regions; text detector; convolutional neural networks; multi-level supervised information; multi-task learning; READING TEXT; LOCALIZATION;
DOI
10.1109/TIP.2016.2547588
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Recent deep learning models have demonstrated strong capabilities for classifying text and non-text components in natural images. They extract a high-level feature globally computed from a whole image component (patch), where the cluttered background information may dominate true text features in the deep representation. This leads to less discriminative power and poorer robustness. In this paper, we present a new system for scene text detection by proposing a novel text-attentional convolutional neural network (Text-CNN) that particularly focuses on extracting text-related regions and features from the image components. We develop a new learning mechanism to train the Text-CNN with multi-level and rich supervised information, including text region mask, character label, and binary text/non-text information. The rich supervision information gives the Text-CNN a strong capability for discriminating ambiguous texts, and also increases its robustness against complicated background components. The training process is formulated as a multi-task learning problem, where low-level supervised information greatly facilitates the main task of text/non-text classification. In addition, a powerful low-level detector called contrast-enhancement maximally stable extremal regions (MSERs) is developed, which extends the widely used MSERs by enhancing intensity contrast between text patterns and background. This allows it to detect highly challenging text patterns, resulting in a higher recall. Our approach achieved promising results on the ICDAR 2013 data set, with an F-measure of 0.82, substantially improving the state-of-the-art results.
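The multi-task formulation described in the abstract can be illustrated with a minimal sketch. It combines a main text/non-text classification loss with two auxiliary losses, one for the character label and one for the text region mask. The abstract does not give the loss forms or weights, so everything here is an assumption made for illustration: the softmax cross-entropy for the two classification terms, the mean squared error for the mask term, the weights `w_char` and `w_mask`, and all function names are hypothetical and are not taken from the paper.

```python
import math
from typing import Optional, Sequence


def softmax_cross_entropy(logits: Sequence[float], label: int) -> float:
    """Cross-entropy of a softmax over `logits` against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def mask_l2_loss(pred_mask: Sequence[float], true_mask: Sequence[float]) -> float:
    """Mean squared error between a predicted and a target mask (flattened)."""
    return sum((p - t) ** 2 for p, t in zip(pred_mask, true_mask)) / len(true_mask)


def multi_task_loss(
    text_logits: Sequence[float],
    is_text: int,
    char_logits: Sequence[float],
    char_label: Optional[int],
    pred_mask: Sequence[float],
    true_mask: Sequence[float],
    w_char: float = 1.0,  # hypothetical weight for the character-label task
    w_mask: float = 1.0,  # hypothetical weight for the region-mask task
) -> float:
    """Weighted sum of the main loss and the two auxiliary losses."""
    # Main task: binary text/non-text classification.
    loss = softmax_cross_entropy(text_logits, is_text)
    # Auxiliary task 1: character label. Skipped when no label is given
    # (an assumption about how non-text samples might be handled).
    if char_label is not None:
        loss += w_char * softmax_cross_entropy(char_logits, char_label)
    # Auxiliary task 2: text region mask regression.
    loss += w_mask * mask_l2_loss(pred_mask, true_mask)
    return loss


if __name__ == "__main__":
    total = multi_task_loss(
        text_logits=[0.0, 0.0], is_text=1,
        char_logits=[0.0, 0.0, 0.0, 0.0], char_label=2,
        pred_mask=[0.5, 0.5], true_mask=[1.0, 0.0],
    )
    print(f"total loss: {total:.4f}")
```

In this sketch the auxiliary terms act only as extra training signals. At inference time only the text/non-text output would be needed, which matches the abstract's description of the low-level supervision as facilitating the main task.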
Pages: 2529-2541
Page count: 13