Page segmentation using minimum homogeneity algorithm and adaptive mathematical morphology

Authors
Tuan Anh Tran
In Seop Na
Soo Hyung Kim
Affiliation
[1] Chonnam National University, School of Electronic and Computer Engineering
Source
International Journal on Document Analysis and Recognition (IJDAR) | 2016 / Vol. 19
Keywords
Page segmentation; Document layout analysis; Homogeneity structure; OCR; Mathematical morphology; Recursive filter
Abstract
Document layout analysis, or page segmentation, is the task of decomposing a document image into regions such as text, images, separators, and tables. It remains a challenging problem because of the variety of document layouts. In this paper, we propose a novel hybrid method with three main stages. In the first stage, text and non-text elements are classified using the minimum homogeneity algorithm, which combines connected component analysis with a multilevel homogeneity structure. In the second stage, a new homogeneity structure is combined with adaptive mathematical morphology on the text document to obtain a set of text regions, while on the non-text document the non-text elements are further classified into separator regions, table regions, image regions, and so on. In the final stage, region refinement and noise detection, all regions in both the text document and the non-text document are refined to eliminate noise and to obtain the geometric layout of each region. The proposed method was tested on the dataset of the ICDAR2009 page segmentation competition and on many other databases in different languages. In these tests, the proposed method achieved higher accuracy than the other methods it was compared with, which supports its effectiveness.
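The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' minimum homogeneity algorithm: as a stand-in, it labels connected components, treats any component much taller than the median as non-text (a hypothetical height-ratio heuristic), and then merges the remaining text components with a horizontal dilation whose radius is derived from the median text height. All function names, the `max_ratio` parameter, and the toy page are assumptions made for illustration only.

```python
from collections import deque

def connected_components(img):
    """Return bounding boxes (top, left, bottom, right) of 8-connected foreground blobs."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

def split_by_height(boxes, max_ratio=2.0):
    """Assumed heuristic (not from the paper): boxes far taller than the median are non-text."""
    if not boxes:
        return [], []
    heights = sorted(b[2] - b[0] + 1 for b in boxes)
    median = heights[len(heights) // 2]
    text = [b for b in boxes if b[2] - b[0] + 1 <= max_ratio * median]
    non_text = [b for b in boxes if b[2] - b[0] + 1 > max_ratio * median]
    return text, non_text

def keep_boxes(img, boxes):
    """Copy only the pixels that fall inside the given bounding boxes."""
    out = [[0] * len(img[0]) for _ in img]
    for top, left, bottom, right in boxes:
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                out[y][x] = img[y][x]
    return out

def dilate_horizontal(img, radius):
    """Binary dilation with a 1 x (2 * radius + 1) horizontal structuring element."""
    w = len(img[0])
    return [[1 if any(row[max(0, x - radius):min(w, x + radius + 1)]) else 0
             for x in range(w)] for row in img]

def segment(img, max_ratio=2.0):
    """Return (text region boxes, non-text component boxes) for a binary page."""
    text, non_text = split_by_height(connected_components(img), max_ratio)
    if not text:
        return [], non_text
    heights = sorted(b[2] - b[0] + 1 for b in text)
    # The structuring element adapts to the text size: larger glyphs get a wider element.
    radius = max(1, heights[len(heights) // 2] // 2)
    merged = dilate_horizontal(keep_boxes(img, text), radius)
    return connected_components(merged), non_text

def toy_page():
    """Hypothetical 12 x 20 page: two lines of 2 x 2 'glyphs' and one large block."""
    page = [[0] * 20 for _ in range(12)]
    blobs = [(1, 1, 2, 2), (1, 4, 2, 5), (1, 7, 2, 8),   # text line 1
             (5, 1, 6, 2), (5, 4, 6, 5),                 # text line 2
             (5, 12, 10, 18)]                            # image-like block
    for top, left, bottom, right in blobs:
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                page[y][x] = 1
    return page
```

Calling `segment(toy_page())` groups the small glyphs into one region per text line and reports the large block separately as a non-text component. A real system would replace the height heuristic with the homogeneity analysis the abstract describes and would follow it with the refinement and noise-removal stage.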
Pages: 191 - 209
Number of pages: 18
Related papers
50 in total
  • [1] Page segmentation using minimum homogeneity algorithm and adaptive mathematical morphology
    Tuan Anh Tran
    Na, In Seop
    Kim, Soo Hyung
    INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2016, 19 (03) : 191 - 209
  • [2] Hybrid Page Segmentation using Multilevel Homogeneity Structure
    Tuan Anh Tran
    Na, In-Seop
    Kim, Soo-Hyung
    ACM IMCOM 2015, PROCEEDINGS, 2015,
  • [3] An adaptive over-split and merge algorithm for page segmentation
    Ha Dai-Ton
    Nguyen Duc-Dung
    Le Duc-Hieu
    PATTERN RECOGNITION LETTERS, 2016, 80 : 137 - 143
  • [4] Microarray Image Segmentation Using Region Growing Algorithm and Mathematical Morphology
    Ye, Ping
    Weng, Guirong
    FIFTH INTERNATIONAL CONFERENCE ON INFORMATION ASSURANCE AND SECURITY, VOL 2, PROCEEDINGS, 2009, : 373 - 376
  • [5] Retinal Vessel Segmentation Using Parallel Grayscale Skeletonization Algorithm and Mathematical Morphology
    Rodrigues, Jardel
    Bezerra, Nivando
    2016 29TH SIBGRAPI CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 2016, : 17 - 24
  • [6] Lung Nodule Segmentation using Mathematical Morphology
    Qu, Mingzhi
    Weng, Guirong
    INFORMATION TECHNOLOGY FOR MANUFACTURING SYSTEMS II, PTS 1-3, 2011, 58-60 : 1378 - 1383
  • [7] A new algorithm for speckle suppression using mathematical morphology and adaptive weighted technique
    Jiang, Li-Hui
    Jin, Zhen-Ni
    Zhang, Fan
    Liu, Rui-Hua
    PROCEEDINGS OF 2007 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2007, : 2427 - 2430
  • [8] Segmentation of Moving Objects Using Mathematical Morphology
    LU Guan-ming
    BI Hou-jie (Department of Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing
    The Journal of China Universities of Posts and Telecommunications, 1999, (02) : 63 - 66
  • [9] Face segmentation using mathematical morphology on single faces
    Vargas-Vazquez, D.
    Rodriguez-Resendiz, J.
    Jimenez-Sanchez, A. R.
    Bocarando-Chacon, J-G
    2016 12TH CONGRESO INTERNACIONAL DE INGENIER (CONIIN), 2016,
  • [10] Segmentation of Medical Images using Fuzzy Mathematical Morphology
    Bouchet, A.
    Pastore, J.
    Ballarin, V.
    JOURNAL OF COMPUTER SCIENCE & TECHNOLOGY, 2007, 7 (03) : 256 - 262