Persian handwritten digit, character and word recognition using deep learning

Cited by: 16
Authors
Bonyani, Mahdi [1]
Jahangard, Simindokht [2]
Daneshmand, Morteza [3]
Affiliations
[1] Univ Tabriz, Dept Comp Engn, Tabriz, Iran
[2] Amirkabir Univ Technol, Dept Robot Engn, Tehran, Iran
[3] Univ Tartu, Inst Technol, Tartu, Estonia
Keywords
Optical character recognition (OCR); Persian characters and words; Deep neural networks; DenseNet; Xception
DOI
10.1007/s10032-021-00368-2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Despite the many applications of digit, letter and word recognition, only a few studies have dealt with Persian script. In this paper, deep neural networks based on different DenseNet and Xception architectures are used, further boosted by data augmentation and test-time augmentation. The datasets are divided into training, validation and test sets, and k-fold cross-validation is used to compare the proposed method with various state-of-the-art alternatives. Three datasets are used: HODA, Sadri and Iranshahr, which offer the most comprehensive collections of samples in terms of handwriting styles and the forms each letter may take depending on its position within a word. On the HODA dataset, we achieve recognition rates of 99.49% and 98.10% for digits and characters, respectively; on the Sadri dataset, 99.72%, 89.99% and 98.82% for digits, characters and words, respectively; and on the Iranshahr dataset, 98.99% for words. Each of these results outperforms those achieved by the most advanced alternative networks, namely ResNet50 and VGG16. An additional contribution of the paper is that it treats word recognition as holistic image classification. This significantly improves speed and versatility, as it does not require explicit character models, unlike earlier alternatives such as hidden Markov models and convolutional recursive neural networks. In addition, computation times have been compared with those of alternative state-of-the-art models, and better performance was observed.
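The test-time augmentation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which transformations the authors apply, so the one-pixel shifts below are an assumption chosen for illustration, and `shift_image`, `predict_with_tta` and `toy_predict_proba` are hypothetical names. The toy classifier merely stands in for a trained DenseNet or Xception model. The idea shown is only the general one: classify several slightly perturbed views of the same image and average the predicted class probabilities.

```python
import numpy as np


def shift_image(image, dx, dy):
    """Shift a 2-D image by (dx, dy) pixels, filling vacated pixels with zeros."""
    shifted = np.zeros_like(image)
    h, w = image.shape
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    shifted[dst_y, dst_x] = image[src_y, src_x]
    return shifted


def predict_with_tta(predict_proba, image,
                     shifts=((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1))):
    """Average class probabilities over shifted views of one image.

    predict_proba: callable mapping a batch (n, h, w) to probabilities
    (n, n_classes). The shift set is an illustrative assumption, not the
    augmentation policy of the paper.
    """
    views = np.stack([shift_image(image, dx, dy) for dx, dy in shifts])
    mean_probs = predict_proba(views).mean(axis=0)
    return int(mean_probs.argmax()), mean_probs


def toy_predict_proba(batch):
    """Hypothetical stand-in classifier: class 0 if the left half is brighter."""
    half = batch.shape[2] // 2
    left = batch[:, :, :half].mean(axis=(1, 2))
    right = batch[:, :, half:].mean(axis=(1, 2))
    logits = np.stack([left, right], axis=1)
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    image = np.zeros((8, 8))
    image[:, :3] = 1.0  # bright strokes on the left side
    label, probs = predict_with_tta(toy_predict_proba, image)
    print(label, probs)
```

Small shifts are used here rather than flips because mirroring a handwritten character can change its identity; a real pipeline would pick transformations that preserve the label.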
Pages: 133-143
Number of pages: 11