Persian handwritten digit, character and word recognition using deep learning

Cited by: 16
Authors
Bonyani, Mahdi [1]
Jahangard, Simindokht [2]
Daneshmand, Morteza [3]
Affiliations
[1] Univ Tabriz, Dept Comp Engn, Tabriz, Iran
[2] Amirkabir Univ Technol, Dept Robot Engn, Tehran, Iran
[3] Univ Tartu, Inst Technol, Tartu, Estonia
Keywords
Optical character recognition (OCR); Persian characters and words; Deep neural networks; DenseNet; Xception
DOI
10.1007/s10032-021-00368-2
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Despite the many applications of digit, letter and word recognition, only a few studies have addressed Persian script. In this paper, deep neural networks based on several DenseNet and Xception architectures are applied, further boosted by data augmentation and test-time augmentation. The datasets are divided into training, validation and test sets, and k-fold cross-validation is used to compare the proposed method with various state-of-the-art alternatives. Three datasets are used: HODA, Sadri and Iranshahr, which offer the most comprehensive collections of samples in terms of handwriting styles and the forms each letter may take depending on its position within a word. On the HODA dataset, we achieve recognition rates of 99.49% and 98.10% for digits and characters, respectively; on the Sadri dataset, 99.72%, 89.99% and 98.82% for digits, characters and words, respectively; and on the Iranshahr dataset, 98.99% for words. Each of these results outperforms the most advanced alternative networks, namely ResNet50 and VGG16. An additional contribution of the paper is that it recognizes words through holistic image classification. This significantly improves speed and versatility, as it does not require explicit character models, unlike earlier alternatives such as hidden Markov models and convolutional recursive neural networks. In addition, computation times have been compared with those of alternative state-of-the-art models, and better performance has been observed.
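As a rough illustration of the test-time augmentation mentioned in the abstract, the sketch below averages class probabilities over several horizontally shifted copies of one input image. It is not the authors' implementation: the choice of small horizontal shifts, the helper names `shift_horizontal` and `predict_with_tta`, and the stand-in classifier `toy_predict` are all assumptions made for illustration, and a real system would call a trained DenseNet or Xception model instead.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]  # grayscale image as rows of pixel intensities
Probs = List[float]        # one probability per class


def shift_horizontal(image: Image, dx: int) -> Image:
    """Shift an image by dx pixels (positive = right), padding with zeros.

    Assumes abs(dx) is smaller than the image width.
    """
    width = len(image[0])
    shifted = []
    for row in image:
        if dx >= 0:
            shifted.append([0.0] * dx + row[: width - dx])
        else:
            shifted.append(row[-dx:] + [0.0] * (-dx))
    return shifted


def predict_with_tta(
    predict: Callable[[Image], Probs],
    image: Image,
    shifts: Sequence[int] = (-1, 0, 1),
) -> Probs:
    """Average the classifier's probabilities over shifted copies of the image."""
    all_probs = [predict(shift_horizontal(image, dx)) for dx in shifts]
    n_views = len(all_probs)
    n_classes = len(all_probs[0])
    return [sum(p[c] for p in all_probs) / n_views for c in range(n_classes)]


def toy_predict(image: Image) -> Probs:
    """Hypothetical two-class stand-in: scores how much ink lies in the left half."""
    width = len(image[0])
    left = sum(sum(row[: width // 2]) for row in image)
    total = sum(sum(row) for row in image)
    if total == 0:
        return [0.5, 0.5]  # blank image: no evidence for either class
    p_left = left / total
    return [p_left, 1.0 - p_left]


if __name__ == "__main__":
    sample = [[1.0, 0.0, 0.0, 0.0]]
    print(predict_with_tta(toy_predict, sample))
```

Averaging probabilities is only one way to combine the augmented views; the abstract does not say which combination rule or which augmentations were used.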
Pages: 133-143
Number of pages: 11