Automatic localization and extraction of tables from handheld mobile-camera captured handwritten document images

Cited by: 5
Authors
Amarnath, R. [1]
Sindhushree, G. S. [1]
Nagabhushan, P. [1, 2]
Javed, Mohammed [2]
Affiliations
[1] Univ Mysore, Dept Studies Comp Sci, Mysore 570006, Karnataka, India
[2] Indian Inst Informat Technol Allahabad, Dept Informat Technol, Allahabad, Uttar Pradesh, India
Keywords
Handwritten document images; mobile-cameras; block-wise mean-computed fuzzy based binarization; fast edge-feature extraction; localizing the table;
DOI
10.3233/JIFS-181242
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
A table is a compact, effective, and structured way of representing information in a document. Automatically localizing tables in scanned handwritten document images and extracting their information is a critical and challenging task for applications such as Optical Character Recognition, handwriting analysis, and auto-evaluation systems. The task becomes more complex when the handwritten document images are acquired through handheld mobile cameras, because the captured images are distorted by poor illumination, device vibration, camera angle, camera orientation, camera movement, and camera distance. This research article proposes a novel technique for automatic localization and segmentation of tables in handwritten document images captured with a handheld mobile camera. Ruling lines are generally used for structuring tables, sketching figures, and writing scientific equations. In this work, tables are detected and extracted from the edge features of their ruling lines in three main stages. First, a block-wise mean-computed fuzzy-based binarization technique is proposed for analyzing the distortion in the acquired image, after which the background surface that envelops the document area of the image is removed. Second, a horizontal and vertical granule (strip) based technique is proposed for fast edge-feature extraction from the ruling lines of the table in the binarized image. Finally, entropy quantifiers are employed to segment the table in the image. The performance of the proposed technique is evaluated and reported on the proposed composite handwritten benchmark dataset. Linear computational complexity, O(h × w), is observed in the worst case.
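The first two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the fuzzy membership step is replaced by a plain comparison against each block's mean, and the strip-based edge features are replaced by a simple per-row ink fraction. The function names `block_mean_binarize` and `ruling_line_rows` and the parameters `block`, `offset`, and `min_fill` are hypothetical and chosen only for illustration.

```python
def block_mean_binarize(image, block=2, offset=10):
    """Binarize a grayscale image given as a list of rows of ints (0-255).

    Assumption: a pixel counts as ink (1) when it is darker than the mean
    of its own block minus `offset`. This is a plain block-mean threshold,
    standing in for the fuzzy-based rule named in the abstract.
    """
    h = len(image)
    w = len(image[0]) if h else 0
    out = [[0] * w for _ in range(h)]
    for top in range(0, h, block):
        for left in range(0, w, block):
            rows = range(top, min(top + block, h))
            cols = range(left, min(left + block, w))
            pixels = [image[r][c] for r in rows for c in cols]
            mean = sum(pixels) / len(pixels)
            for r in rows:
                for c in cols:
                    if image[r][c] < mean - offset:
                        out[r][c] = 1
    return out


def ruling_line_rows(binary, min_fill=0.8):
    """Return indices of rows whose fraction of ink pixels is >= min_fill.

    Assumption: a horizontal ruling line shows up as a row that is almost
    entirely ink. Applying the same test to columns would flag vertical lines.
    """
    return [r for r, row in enumerate(binary)
            if row and sum(row) / len(row) >= min_fill]


# A 4x4 toy image: light background with one dark horizontal line in row 1.
image = [
    [200, 200, 200, 200],
    [20, 20, 20, 20],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]
binary = block_mean_binarize(image, block=2, offset=10)
line_rows = ruling_line_rows(binary)
```

Both functions visit each pixel a constant number of times, so the sketch runs in time proportional to h × w, in line with the linear cost the abstract reports. The `offset` keeps uniform or slightly noisy background blocks from being marked as ink.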
Pages: 2527-2544
Page count: 18