Offline Urdu Nastaleeq Optical Character Recognition Based on Stacked Denoising Autoencoder

Cited by: 0
Authors
Ahmad, Ibrar [1 ,2 ]
Wang, Xiaojie [1 ]
Li, Ruifan [1 ]
Rasheed, Shahid [3 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Comp Sci, CIST, 10 Xitucheng Rd, Beijing 100876, Peoples R China
[2] Univ Peshawar, Dept Comp Sci, Peshawar 25120, Pakistan
[3] PTCL, Islamabad 44000, Pakistan
Funding
National Natural Science Foundation of China
Keywords
offline printed ligature recognition; Urdu Nastaleeq; denoising autoencoder; deep learning; classification
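DOI
Not available
Chinese Library Classification (CLC) number
TN [Electronic technology, communication technology]
Discipline classification code
0809
Abstract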
Offline Urdu Nastaleeq text recognition has long been a difficult problem because of the script's highly cursive nature. To avoid character segmentation problems, many researchers are shifting their focus towards segmentation-free, ligature-based recognition approaches. The majority of prevalent ligature-based recognition systems rely heavily on hand-engineered feature extraction techniques. However, such techniques are error prone and may often lose useful information that can hardly be recovered later by any manual features. Most prevalent Urdu Nastaleeq text recognition systems have also been trained and tested on small data sets. This paper proposes the use of a stacked denoising autoencoder for automatic feature extraction directly from the raw pixel values of ligature images. Such deep learning networks have not been applied to the recognition of Urdu text thus far. Different stacked denoising autoencoders have been trained on 178,573 ligatures with 3,732 classes from the un-degraded (noise-free) UPTI (Urdu Printed Text Image) data set. Subsequently, the trained networks are validated and tested on degraded versions of the UPTI data set. The experimental results demonstrate accuracies in the range of 93% to 96%, which are better than those of existing Urdu OCR systems for such a large data set of ligatures.
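The abstract does not state the network architecture, noise type, or hyperparameters, so the following is only a minimal sketch of the general technique it names: greedy layer-wise pretraining of a stacked denoising autoencoder on raw pixel vectors. Everything specific here is an assumption made for illustration, not the authors' method: the class `DenoisingAutoencoder`, the helper `pretrain_stack`, masking noise, tied weights, sigmoid units, the layer sizes, and the toy binary data standing in for ligature images.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class DenoisingAutoencoder:
    """One layer: corrupt the input, encode it, decode with tied weights.

    Hypothetical illustration; not the architecture used in the paper.
    """

    def __init__(self, n_visible, n_hidden, corruption=0.3, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = self.rng.normal(0.0, 0.1, size=(n_visible, n_hidden))
        self.b_hidden = np.zeros(n_hidden)
        self.b_visible = np.zeros(n_visible)
        self.corruption = corruption  # assumed fraction of pixels zeroed

    def encode(self, x):
        return sigmoid(x @ self.W + self.b_hidden)

    def decode(self, h):
        return sigmoid(h @ self.W.T + self.b_visible)

    def train_step(self, x, lr=0.5):
        """One full-batch gradient step; returns the pre-update loss."""
        # Masking noise: zero a random fraction of the input pixels.
        mask = self.rng.random(x.shape) >= self.corruption
        x_noisy = x * mask
        h = self.encode(x_noisy)
        x_rec = self.decode(h)
        # Cross-entropy against the CLEAN input; with a sigmoid output
        # the gradient at the output pre-activation is (x_rec - x).
        delta_out = (x_rec - x) / x.shape[0]
        delta_hid = (delta_out @ self.W) * h * (1.0 - h)
        # Tied weights: W receives gradient from encoder and decoder.
        grad_W = x_noisy.T @ delta_hid + delta_out.T @ h
        self.W -= lr * grad_W
        self.b_hidden -= lr * delta_hid.sum(axis=0)
        self.b_visible -= lr * delta_out.sum(axis=0)
        eps = 1e-9
        ce = x * np.log(x_rec + eps) + (1.0 - x) * np.log(1.0 - x_rec + eps)
        return float(-np.mean(ce.sum(axis=1)))


def pretrain_stack(x, layer_sizes, epochs=200, lr=0.5):
    """Greedy layer-wise pretraining: each layer is trained on the
    clean activations of the layer below it."""
    layers, inputs = [], x
    for i, n_hidden in enumerate(layer_sizes):
        dae = DenoisingAutoencoder(inputs.shape[1], n_hidden, seed=i)
        for _ in range(epochs):
            dae.train_step(inputs, lr)
        layers.append(dae)
        inputs = dae.encode(inputs)
    return layers, inputs


# Toy stand-in for flattened ligature images: 64 samples of 16 binary
# "pixels", drawn from 4 prototype patterns (purely synthetic data).
rng = np.random.default_rng(42)
prototypes = (rng.random((4, 16)) < 0.3).astype(float)
data = prototypes[rng.integers(0, 4, size=64)]

layers, features = pretrain_stack(data, layer_sizes=[8, 4])
print(features.shape)
```

In a full recognizer, the features from the top layer would typically feed a supervised classifier (for example a softmax layer over the ligature classes), and the whole stack would then be fine-tuned; that stage is omitted here because the abstract gives no details about it.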
Pages: 146 - 157
Number of pages: 12
Related papers
50 in total
  • [31] Photometric Ligature Extraction Technique for Urdu Optical Character Recognition
    Kazmi, Majida
    Yasir, Fauzia
    Habib, Samreen
    Hayat, Muhammad Saad
    Qazi, Saad Ahmed
    ENGINEERING TECHNOLOGY & APPLIED SCIENCE RESEARCH, 2021, 11 (06) : 7968 - 7973
  • [32] Underwater target recognition method based on t-SNE and stacked nonnegative constrained denoising autoencoder
    Chen, Yuechao
    Xu, Xiaonan
    Zhou, Bin
    Quan, Hengheng
    INDIAN JOURNAL OF GEO-MARINE SCIENCES, 2019, 48 (11) : 1822 - 1832
  • [33] Ligature based Urdu Nastaleeq sentence recognition using gated bidirectional long short term memory
    Ibrar Ahmad
    Xiaojie Wang
    Yuzhao Mao
    Guang Liu
    Haseeb Ahmad
    Rahat Ullah
    Cluster Computing, 2018, 21 : 703 - 714
  • [34] Ligature based Urdu Nastaleeq sentence recognition using gated bidirectional long short term memory
    Ahmad, Ibrar
    Wang, Xiaojie
    Mao, Yuzhao
    Liu, Guang
    Ahmad, Haseeb
    Ullah, Rahat
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2018, 21 (01) : 703 - 714
  • [35] Robust Feature Extraction for Geochemical Anomaly Recognition Using a Stacked Convolutional Denoising Autoencoder
    Xiong, Yihui
    Zuo, Renguang
    MATHEMATICAL GEOSCIENCES, 2022, 54 (03) : 623 - 644
  • [36] Process Operational State Assessment Based on Stacked Enhanced Denoising Autoencoder
    Feng, Binsheng
    Liu, Yan
    Wang, Fuli
    PROCEEDINGS OF THE 36TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC 2024, 2024 : 5342 - 5347
  • [37] Network intrusion detection based on Contractive Sparse Stacked Denoising Autoencoder
    Lu, Jizhao
    Meng, Huiping
    Li, Wencui
    Liu, Yue
    Guo, Yihao
    Yang, Yang
    2021 IEEE INTERNATIONAL SYMPOSIUM ON BROADBAND MULTIMEDIA SYSTEMS AND BROADCASTING (BMSB), 2021
  • [38] Robust Feature Extraction for Geochemical Anomaly Recognition Using a Stacked Convolutional Denoising Autoencoder
    Yihui Xiong
    Renguang Zuo
    Mathematical Geosciences, 2022, 54 : 623 - 644
  • [39] Automatic Arrival Time Detection for Earthquakes Based on Stacked Denoising Autoencoder
    Saad, Omar M.
    Inoue, Koji
    Shalaby, Ahmed
    Samy, Lotfy
    Sayed, Mohammed S.
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2018, 15 (11) : 1687 - 1691
  • [40] A Robust Acoustic Feature Extraction Approach Based On Stacked Denoising Autoencoder
    Liu, J. H.
    Zheng, W. Q.
    Zou, Y. X.
    2015 1ST IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM), 2015 : 124 - 127