Recognition of offline handwritten Urdu characters using RNN and LSTM models

Cited by: 14

Authors:

Misgar, Muzafar Mehraj ^{[1]}

Mushtaq, Faisel ^{[1]}

Khurana, Surinder Singh ^{[1]}

Kumar, Munish ^{[2]}

Affiliations:

[1] Cent Univ Punjab, Dept Comp Sci & Technol, Bathinda, India

[2] Maharaja Ranjit Singh Punjab Tech Univ, Dept Computat Sci, Bathinda, India

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2023 / Vol. 82 / Issue 02

Keywords:

Optical character recognition (OCR); Deep learning (DL); Recurrent neural network (RNN); Long short term memory (LSTM);

DOI:

10.1007/s11042-022-13320-1

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Optical character recognition (OCR) helps convert different types of scanned documents, such as images, into searchable and editable content. OCR is language dependent, and very limited research has been carried out in this field for Urdu and Urdu-like scripts (e.g., Farsi and Arabic), unlike languages such as English and Hindi. The lack of research is attributed to a lack of publicly available benchmark databases and to the inherent complexities of these languages, such as their cursive nature and the change in the shape of a character depending on its position in a ligature. Each character has 2-4 different shapes depending on its position in the word: initial, medial, or final. In this article, we propose a methodology to automate the data collection process, collect a large handwritten dataset of 110,785 Urdu characters, and present a comparative analysis of two deep learning models, SimpleRNN and LSTM, to showcase the potential of RNN models for character recognition. Data was collected from 250 authors on A4-size sheets. Each sheet contains 132 shapes for Urdu characters and 10 numerals. As far as the authors know, this is the first time that such a large dataset has been proposed that contains all the possible shapes of Urdu characters, as well as numerals. Experiments were carried out separately for the numerals, the full characters, and the whole dataset to compare the classification capabilities of the RNN and LSTM models. Despite the inherent complexities of Urdu script, the RNN and LSTM models proved effective, achieving high accuracy rates. The accuracies achieved by the RNN for each category are 96.96% for numerals, 85.22% for full characters, and 73.62% for the whole dataset; the LSTM outperforms it, with maximum accuracies of 97.80% for numerals, 97.43% for full characters, and 91.30% for the whole dataset. In addition, the proposed dataset opens a new window for future research, showing considerable potential for data analysis not only for Urdu but also for other languages, such as Arabic and Persian, which use similar character sets.


Pages: 2053-2076

Number of pages: 24
