Text-Attentional Convolutional Neural Network for Scene Text Detection

Cited by: 233
Authors
He, Tong [1, 2]
Huang, Weilin [1, 3]
Qiao, Yu [1, 3]
Yao, Jian [2]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China
[2] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430072, Peoples R China
[3] Chinese Univ Hong Kong, Multimedia Lab, Hong Kong, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Maximally stable extremal regions; text detector; convolutional neural networks; multi-level supervised information; multi-task learning; READING TEXT; LOCALIZATION;
DOI
10.1109/TIP.2016.2547588
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent deep learning models have demonstrated strong capabilities for classifying text and non-text components in natural images. They extract a high-level feature globally computed from a whole image component (patch), where the cluttered background information may dominate true text features in the deep representation. This leads to less discriminative power and poorer robustness. In this paper, we present a new system for scene text detection by proposing a novel text-attentional convolutional neural network (Text-CNN) that particularly focuses on extracting text-related regions and features from the image components. We develop a new learning mechanism to train the Text-CNN with multi-level and rich supervised information, including text region mask, character label, and binary text/non-text information. The rich supervision information enables the Text-CNN with a strong capability for discriminating ambiguous texts, and also increases its robustness against complicated background components. The training process is formulated as a multi-task learning problem, where low-level supervised information greatly facilitates the main task of text/non-text classification. In addition, a powerful low-level detector called contrast-enhancement maximally stable extremal regions (MSERs) is developed, which extends the widely used MSERs by enhancing intensity contrast between text patterns and background. This allows it to detect highly challenging text patterns, resulting in a higher recall. Our approach achieved promising results on the ICDAR 2013 data set, with an F-measure of 0.82, substantially improving the state-of-the-art results.
Pages: 2529-2541
Page count: 13