Text-Attentional Convolutional Neural Network for Scene Text Detection

Cited by: 233
Authors
He, Tong [1 ,2 ]
Huang, Weilin [1 ,3 ]
Qiao, Yu [1 ,3 ]
Yao, Jian [2 ]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China
[2] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430072, Peoples R China
[3] Chinese Univ Hong Kong, Multimedia Lab, Hong Kong, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Maximally stable extremal regions; text detector; convolutional neural networks; multi-level supervised information; multi-task learning; READING TEXT; LOCALIZATION;
DOI
10.1109/TIP.2016.2547588
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent deep learning models have demonstrated strong capabilities for classifying text and non-text components in natural images. They extract a high-level feature computed globally from a whole image component (patch), where cluttered background information may dominate the true text features in the deep representation. This leads to less discriminative power and poorer robustness. In this paper, we present a new system for scene text detection by proposing a novel text-attentional convolutional neural network (Text-CNN) that focuses on extracting text-related regions and features from the image components. We develop a new learning mechanism to train the Text-CNN with multi-level and rich supervised information, including text region mask, character label, and binary text/non-text information. The rich supervision information gives the Text-CNN a strong capability for discriminating ambiguous texts, and also increases its robustness against complicated background components. The training process is formulated as a multi-task learning problem, where low-level supervised information greatly facilitates the main task of text/non-text classification. In addition, a powerful low-level detector called contrast-enhancement maximally stable extremal regions (MSERs) is developed, which extends the widely used MSERs by enhancing the intensity contrast between text patterns and background. This allows it to detect highly challenging text patterns, resulting in a higher recall. Our approach achieved promising results on the ICDAR 2013 data set, with an F-measure of 0.82, substantially improving the state-of-the-art results.
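The multi-task formulation described in the abstract can be illustrated with a minimal sketch. It assumes a weighted sum of three loss terms, one per supervision signal (text/non-text, character label, region mask). The specific loss functions, the weights `w_char` and `w_mask`, and all function names are hypothetical choices made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy of a softmax over `logits` against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def mask_l2(pred_mask, true_mask):
    """Mean squared error between flattened predicted and target region masks."""
    return sum((p - t) ** 2 for p, t in zip(pred_mask, true_mask)) / len(true_mask)

def multi_task_loss(text_logits, is_text, char_logits=None, char_label=None,
                    pred_mask=None, true_mask=None, w_char=1.0, w_mask=1.0):
    """Main text/non-text loss plus optional auxiliary terms.

    Assumption: the auxiliary terms are added only when their labels are
    supplied, and the weights w_char and w_mask are illustrative defaults.
    """
    loss = softmax_cross_entropy(text_logits, is_text)  # main task
    if char_logits is not None and char_label is not None:
        loss += w_char * softmax_cross_entropy(char_logits, char_label)
    if pred_mask is not None and true_mask is not None:
        loss += w_mask * mask_l2(pred_mask, true_mask)
    return loss

# Example: one text component with a 4-way character label and a 2-pixel mask.
example = multi_task_loss([0.0, 0.0], 1, [0.0] * 4, 2, [1.0, 0.0], [1.0, 0.0])
```

In this sketch the auxiliary terms act only as extra training signals: setting `w_char` and `w_mask` to zero recovers plain text/non-text classification.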
Pages: 2529-2541
Page count: 13