Accurate, data-efficient, unconstrained text recognition with convolutional neural networks

Cited by: 62
Authors
Yousef, Mohamed [1]
Hussain, Khaled F. [1]
Mohammed, Usama S. [2]
Affiliations
[1] Assiut Univ, Fac Comp & Informat, Comp Sci Dept, Asyut 71515, Egypt
[2] Assiut Univ, Elect Engn Dept, Fac Engn, Asyut 71515, Egypt
Keywords
Text recognition; Optical character recognition; Handwriting recognition; CAPTCHA Solving; License plate recognition; Convolutional neural network; Deep learning; SCENE TEXT; LSTM;
DOI
10.1016/j.patcog.2020.107482
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Unconstrained text recognition is an important computer vision task, featuring a wide variety of different sub-tasks, each with its own set of challenges. One of the biggest promises of deep neural networks has been the convergence and automation of feature extractors from input raw signals, allowing for the highest possible performance with minimum required domain knowledge. To this end, we propose a data-efficient, end-to-end neural network model for generic, unconstrained text recognition. In our proposed architecture we strive for simplicity and efficiency without sacrificing recognition accuracy. Our proposed architecture is a fully convolutional network without any recurrent connections trained with the CTC loss function. Thus it operates on arbitrary input sizes and produces strings of arbitrary length in a very efficient and parallelizable manner. We show the generality and superiority of our proposed text recognition architecture by achieving state-of-the-art results on seven public benchmark datasets, covering a wide spectrum of text recognition tasks, namely: Handwriting Recognition, CAPTCHA recognition, OCR, License Plate Recognition, and Scene Text Recognition. Our proposed architecture has won the ICFHR2018 Competition on Automated Text Recognition on a READ Dataset. © 2020 Published by Elsevier Ltd.
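As an illustration of how a CTC-trained network can turn a variable number of per-frame predictions into a string of arbitrary length, the following is a minimal sketch of greedy (best-path) CTC decoding. It is not the authors' implementation: the function name `ctc_greedy_decode`, the choice of index 0 for the blank symbol, and the toy score values are all assumptions made for this example.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol in this sketch


def ctc_greedy_decode(frame_scores, alphabet):
    """Collapse per-frame class scores into a string (best-path CTC decoding).

    frame_scores: list of per-frame score lists; index 0 is the blank class.
    alphabet: string whose i-th character corresponds to class index i + 1.
    """
    # Step 1: pick the highest-scoring class independently for every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # Step 2: merge runs of identical consecutive labels.
    collapsed = [label for label, _ in groupby(best_path)]
    # Step 3: drop blanks and map the remaining indices to characters.
    return "".join(alphabet[label - 1] for label in collapsed if label != BLANK)


# Toy input: 7 frames over the classes (blank, 'a', 'b').
example_frames = [
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.8, 0.1],    # a
    [0.2, 0.7, 0.1],    # a (repeat, merged with the previous frame)
    [0.6, 0.3, 0.1],    # blank (separates the two 'a' characters)
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.2, 0.7],    # b
    [0.2, 0.1, 0.7],    # b (repeat, merged)
]
print(ctc_greedy_decode(example_frames, "ab"))  # prints "aab"
```

Because the output length depends only on how many non-blank runs survive the collapse, the same decoding step works for any number of input frames, which is consistent with the abstract's claim about arbitrary input sizes and output lengths.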
Pages: 12