Robust Unsupervised Segmentation of Degraded Document Images with Topic Models

Cited by: 0
Authors
Burns, Timothy J. [1 ]
Corso, Jason J. [1 ]
Affiliations
[1] SUNY Buffalo, Buffalo, NY 14260 USA
Source
CVPR: 2009 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-4 | 2009
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Segmentation of document images remains a challenging vision problem. Although document images have a structured layout, capturing enough of it for segmentation can be difficult. Most current methods combine text extraction and heuristics for segmentation, but text extraction is prone to failure and measuring accuracy remains a difficult challenge. Furthermore, when presented with significant degradation many common heuristic methods fall apart. In this paper, we propose a Bayesian generative model for document images which seeks to overcome some of these drawbacks. Our model automatically discovers different regions present in a document image in a completely unsupervised fashion. We attempt no text extraction, but rather use discrete patch-based codebook learning to make our probabilistic representation feasible. Each latent region topic is a distribution over these patch indices. We capture rough document layout with an MRF Potts model. We take an analysis-by-synthesis approach to examine the model, and provide quantitative segmentation results on a manually-labeled document image data set. We illustrate our model's robustness by providing results on a highly degraded version of our test set.
Pages: 1287-1294
Number of pages: 8