Accurate, data-efficient, unconstrained text recognition with convolutional neural networks

Cited by: 62
Authors
Yousef, Mohamed [1]
Hussain, Khaled F. [1]
Mohammed, Usama S. [2]
Affiliations
[1] Assiut Univ, Fac Comp & Informat, Comp Sci Dept, Asyut 71515, Egypt
[2] Assiut Univ, Elect Engn Dept, Fac Engn, Asyut 71515, Egypt
Keywords
Text recognition; Optical character recognition; Handwriting recognition; CAPTCHA solving; License plate recognition; Convolutional neural network; Deep learning; Scene text; LSTM
DOI
10.1016/j.patcog.2020.107482
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Unconstrained text recognition is an important computer vision task, featuring a wide variety of different sub-tasks, each with its own set of challenges. One of the biggest promises of deep neural networks has been the convergence and automation of feature extractors from input raw signals, allowing for the highest possible performance with minimum required domain knowledge. To this end, we propose a data-efficient, end-to-end neural network model for generic, unconstrained text recognition. In our proposed architecture we strive for simplicity and efficiency without sacrificing recognition accuracy. Our proposed architecture is a fully convolutional network without any recurrent connections trained with the CTC loss function. Thus it operates on arbitrary input sizes and produces strings of arbitrary length in a very efficient and parallelizable manner. We show the generality and superiority of our proposed text recognition architecture by achieving state-of-the-art results on seven public benchmark datasets, covering a wide spectrum of text recognition tasks, namely: Handwriting Recognition, CAPTCHA recognition, OCR, License Plate Recognition, and Scene Text Recognition. Our proposed architecture has won the ICFHR2018 Competition on Automated Text Recognition on a READ Dataset. © 2020 Published by Elsevier Ltd.
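As a rough illustration of how a CTC-trained network can turn a variable number of per-column outputs into a string of arbitrary length, the sketch below shows best-path (greedy) CTC decoding in plain Python. It is not the authors' implementation: the abstract only states that the model is trained with the CTC loss, so the decoding strategy, the function name ctc_greedy_decode, the blank index 0, and the toy two-letter alphabet are all assumptions made for the example.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical convention)


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks.

    frame_scores: one list of class scores per frame (e.g. per image column).
    alphabet: string whose i-th character corresponds to class index i + 1.
    """
    # Pick the highest-scoring class for each frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # Merge runs of identical consecutive labels into a single label.
    collapsed = [label for label, _ in groupby(best_path)]
    # Remove blanks and map the remaining class indices to characters.
    return "".join(alphabet[label - 1] for label in collapsed if label != blank)


# Toy input: 7 frames over the classes (blank, 'a', 'b').
frames = [
    [0.10, 0.80, 0.10],  # a
    [0.20, 0.70, 0.10],  # a (repeat, merged with the previous frame)
    [0.90, 0.05, 0.05],  # blank (separates the two 'a' characters)
    [0.10, 0.80, 0.10],  # a
    [0.10, 0.10, 0.80],  # b
    [0.10, 0.20, 0.70],  # b (repeat, merged)
    [0.90, 0.05, 0.05],  # blank
]
print(ctc_greedy_decode(frames, "ab"))
```

Because the number of frames simply follows the input width, the same routine handles inputs of any length, and the blank symbol is what allows repeated characters such as the double "a" to survive the merge step.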
Pages: 12
Related papers
50 in total
  • [21] Human Activity Recognition Using Convolutional Neural Networks
    Dogan, Gulustan
    Ertas, Sinem Sena
    Cay, Iremnaz
    2021 IEEE CONFERENCE ON COMPUTATIONAL INTELLIGENCE IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY (CIBCB), 2021, : 76 - 80
  • [22] Sign Language Recognition Using Convolutional Neural Networks
    Pigou, Lionel
    Dieleman, Sander
    Kindermans, Pieter-Jan
    Schrauwen, Benjamin
    COMPUTER VISION - ECCV 2014 WORKSHOPS, PT I, 2015, 8925 : 572 - 578
  • [23] Convolutional Neural Networks for Phoneme Recognition
    Glackin, Cornelius
    Wall, Julie
    Chollet, Gerard
    Dugan, Nazim
    Cannings, Nigel
    PROCEEDINGS OF THE 7TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION APPLICATIONS AND METHODS (ICPRAM 2018), 2018, : 190 - 195
  • [24] Exploring data augmentation for Amazigh speech recognition with convolutional neural networks
    Hossam Boulal
    Farida Bouroumane
    Mohamed Hamidi
    Jamal Barkani
    Mustapha Abarkan
    International Journal of Speech Technology, 2025, 28 (1) : 53 - 65
  • [25] Convolutional Recurrent Neural Networks for Text Classification
    Lyu, Shengfei
    Liu, Jiaqi
    JOURNAL OF DATABASE MANAGEMENT, 2021, 32 (04) : 65 - 82
  • [26] Semi-Automated X-ray Transmission Image Annotation Using Data-efficient Convolutional Neural Networks and Cooperative Machine Learning
    Jabari, Ofentse
    Ayalew, Yirsaw
    Motshegwa, Tshiamo
    2021 THE 5TH INTERNATIONAL CONFERENCE ON VIDEO AND IMAGE PROCESSING, ICVIP 2021, 2021, : 205 - 214
  • [27] Distilling GRU with Data Augmentation for Unconstrained Handwritten Text Recognition
    Liu, Manfei
    Xie, Zecheng
    Huang, YaoXiong
    Jin, Lianwen
    Zhou, Weiyin
    PROCEEDINGS 2018 16TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR), 2018, : 56 - 61
  • [28] Spiking Deep Convolutional Neural Networks for Energy-Efficient Object Recognition
    Yongqiang Cao
    Yang Chen
    Deepak Khosla
    International Journal of Computer Vision, 2015, 113 : 54 - 66
  • [29] Spiking Deep Convolutional Neural Networks for Energy-Efficient Object Recognition
    Cao, Yongqiang
    Chen, Yang
    Khosla, Deepak
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2015, 113 (01) : 54 - 66
  • [30] Efficient and accurate microplastics identification and segmentation in urban waters using convolutional neural networks
    Xu, Jiongji
    Wang, Zhaoli
    SCIENCE OF THE TOTAL ENVIRONMENT, 2024, 911