Line-segment Feature Analysis Algorithm Using Input Dimensionality Reduction for Handwritten Text Recognition

Cited by: 6
Authors
Kim, Chang-Min [1]
Hong, Ellen J. [2]
Chung, Kyungyong [3]
Park, Roy C. [4]
Affiliations
[1] Sangji Univ, Div Comp Informat Engn, Wonju 26339, South Korea
[2] Yonsei Univ, Dept Comp & Telecommun Engn, Wonju 26493, South Korea
[3] Kyonggi Univ, Div Comp Sci & Engn, Suwon 16227, South Korea
[4] Sangji Univ, Dept Informat Commun Software Engn, Wonju 26339, South Korea
Source
APPLIED SCIENCES-BASEL | 2020 / Vol. 10 / Issue 19
Keywords
feature extraction; dimensionality reduction; line-segment features; k-nearest neighbor; support vector machine; character recognition; feature selection
DOI
10.3390/app10196904
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
Demand for handwriting recognition has grown rapidly across industry, in applications such as automated mail sorting, license plate recognition, and electronic memo pads. In image recognition, convolutional neural networks, which perform very well, have also been applied to handwriting recognition. However, as the range of application fields widens, the number of dimensions handled in the learning and inference processes keeps increasing. Principal component analysis (PCA) is commonly used to reduce this dimensionality, but the data compression it performs tends to cost accuracy. In this paper, we therefore propose a line-segment feature analysis (LFA) algorithm for input dimensionality reduction in handwritten text recognition. The algorithm extracts the line-segment information that makes up the input image and assigns a unique value to each segment using 3 x 3 and 5 x 5 filters. The unique values are used to identify and count the line segments, and the counts are accumulated into a 1-D vector of size 512, which serves as the input to a machine-learning model. The method was evaluated on the Extended Modified National Institute of Standards and Technology (EMNIST) database. PCA reached 96.6% and 93.86% accuracy with k-nearest neighbors (KNN) and support vector machine (SVM), respectively, whereas LFA reached 97.5% and 98.9%.
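The abstract does not spell out the filter weights, so the following is only an illustrative sketch of how a 3 x 3 filter could give every local pattern a unique value and yield a 512-element count vector. It assumes a binarized image and a hypothetical power-of-two weight filter (so each 3 x 3 pattern maps to a distinct code in 0..511); the 5 x 5 filter is omitted, and the names `WEIGHTS` and `lfa_like_vector` are invented here. It should not be read as the authors' exact procedure.

```python
from typing import List

# Hypothetical 3x3 weight filter (an assumption, not taken from the paper):
# each cell holds a distinct power of two, so every binary 3x3 pattern
# maps to a unique code in the range 0..511.
WEIGHTS = [[1, 2, 4],
           [8, 16, 32],
           [64, 128, 256]]


def lfa_like_vector(image: List[List[int]]) -> List[int]:
    """Count the 3x3 binary patterns of `image` (0/1 pixels) into 512 bins."""
    height, width = len(image), len(image[0])
    counts = [0] * 512
    # Slide the filter over every fully contained 3x3 window.
    for r in range(height - 2):
        for c in range(width - 2):
            code = sum(
                WEIGHTS[i][j] * image[r + i][c + j]
                for i in range(3)
                for j in range(3)
            )
            counts[code] += 1
    return counts


# A 5x5 image containing one vertical stroke in the middle column.
stroke = [[0, 0, 1, 0, 0] for _ in range(5)]
vector = lfa_like_vector(stroke)
```

Under these assumptions the output always has 512 entries regardless of the image size, which is the sense in which such an encoding fixes the input dimensionality before the vector is passed to a classifier such as KNN or SVM.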
Pages: 1-17
Page count: 17