Optical character recognition (OCR) using partial least square (PLS) based feature reduction: an application to artificial intelligence for biometric identification

Cited by: 16
Authors
Akhtar, Zainab [1]
Lee, Jong Weon [2]
Attique Khan, Muhammad [3]
Sharif, Muhammad [1]
Ali Khan, Sajid [4]
Riaz, Naveed [5]
Affiliations
[1] COMSATS Univ Islamabad, Wah Campus, Islamabad, Pakistan
[2] Sejong Univ, Dept Software, Seoul, South Korea
[3] HITEC Univ, Dept Comp Sci, Taxila, Pakistan
[4] Fdn Univ Islamabad, Islamabad, Pakistan
[5] Natl Univ Sci & Technol, SEECS, Islamabad, Pakistan
Keywords
Character recognition; ROI extraction; Features fusion; Features selection; Recognition; SYSTEM; REGRESSION; FUSION; VECTOR;
DOI
10.1108/JEIM-02-2020-0076
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Purpose: In artificial intelligence, optical character recognition (OCR) is an active research area, driven by well-known applications such as automating the conversion of printed documents into machine-readable text. In academia and banking, the main purpose of OCR is to achieve strong recognition performance and save storage space.
Design/methodology/approach: A novel technique is proposed for automated OCR based on the fusion and selection of multi-property features. The features are fused serially, and the fused output is passed to a partial least squares (PLS) based selection method. Selection is driven by an entropy fitness function. The final features are classified by an ensemble classifier.
Findings: The method was tested extensively on two datasets, one proposed by the authors and the Chars74k benchmark, and achieved accuracies of 91.2% and 99.9%, respectively. Compared with existing techniques, the proposed method gives improved performance.
Originality/value: The technique presented in this work can support license plate recognition and the conversion of printed documents into machine-readable text.
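The fusion and selection steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the feature types, the exact PLS formulation, or the definition of the entropy fitness function. The sketch assumes serial fusion means side-by-side concatenation, ranks features by the first PLS1 weight vector, and computes a histogram-based Shannon entropy per selected feature purely for illustration. All function names, the synthetic data, and the parameter `k` are hypothetical, and the ensemble classifier is omitted.

```python
import numpy as np


def serial_fusion(*feature_sets):
    """Serial fusion (assumed): concatenate per-sample feature matrices column-wise."""
    return np.hstack(feature_sets)


def pls_first_weights(X, y):
    """First PLS1 weight vector: w = Xc^T yc / ||Xc^T yc|| on centered data."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = Xc.T @ yc
    norm = np.linalg.norm(w)
    return w / norm if norm > 0 else w


def shannon_entropy(values, bins=10):
    """Shannon entropy in bits of a histogram over the values (illustrative score)."""
    counts, _ = np.histogram(values, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())


def select_features(X, y, k):
    """Keep the indices of the k features with the largest absolute PLS weight."""
    w = pls_first_weights(X, y)
    idx = np.argsort(-np.abs(w))[:k]
    return np.sort(idx)


# Synthetic stand-ins for two feature families (hypothetical data).
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n).astype(float)
shape_feats = rng.normal(size=(n, 4))
texture_feats = rng.normal(size=(n, 6))
texture_feats[:, 2] += 3 * y  # make one column informative about the label

fused = serial_fusion(shape_feats, texture_feats)   # 4 + 6 = 10 columns
selected = select_features(fused, y, k=3)           # indices into the fused matrix
entropies = [shannon_entropy(fused[:, j]) for j in selected]
print(fused.shape, selected, entropies)
```

In this toy setup the informative texture column sits at fused index 6, so it should appear among the selected indices. A real pipeline would pass `fused[:, selected]` to a classifier; which ensemble the authors used is not stated in the abstract.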
Pages: 767-789
Page count: 23