A multiresolution approach for page segmentation

Cited by: 23
Authors
Cinque, L
Lombardi, L
Manzini, G
Affiliations
[1] Univ Roma La Sapienza, Dipartimento Sci Informazione, I-00198 Rome, Italy
[2] Univ Pavia, Dipartimento Informat & Sistemistica, I-27100 Pavia, Italy
Keywords
document analysis; multiresolution; page segmentation
DOI
10.1016/S0167-8655(97)00169-4
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work we propose a new page segmentation method for recognizing text and graphics based on a multiresolution representation of the page images. Our approach is based on the analysis of a set of feature maps available at different resolution levels. The final output is a description of the physical structure of a page. A page image is broken down into several blocks which represent components of a page, such as text, line-drawings, and pictures. The result, which uses only a small amount of memory in addition to that for the image, may be the first step for a more detailed analysis such as optical character recognition.
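The abstract does not spell out the feature maps or the classification rules, so the following is only a minimal sketch of the general idea of a multiresolution representation, not the authors' method. It assumes a binary page image stored as a list of rows (1 = ink, 0 = background) and builds a pyramid by averaging 2x2 blocks, so each coarser cell holds the ink density of the region it covers. The names `downsample`, `build_pyramid`, and `content_mask`, and the density threshold, are hypothetical and chosen for illustration.

```python
def downsample(image):
    """Halve the resolution by averaging each 2x2 block (odd edges are dropped)."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image, levels):
    """Return [full resolution, half resolution, ...] with at most `levels` entries."""
    pyramid = [image]
    for _ in range(levels - 1):
        current = pyramid[-1]
        if len(current) < 2 or len(current[0]) < 2:
            break  # too small to halve again
        pyramid.append(downsample(current))
    return pyramid


def content_mask(level, threshold):
    """Mark cells whose ink density reaches the (assumed) threshold."""
    return [[value >= threshold for value in row] for row in level]


# Toy 4x4 "page": a dense block in the top-left, one stray pixel bottom-right.
page = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
pyramid = build_pyramid(page, levels=3)
coarse_mask = content_mask(pyramid[1], threshold=0.5)
```

In this toy case the coarse mask keeps the dense top-left block and discards the isolated pixel, which hints at why coarse levels are useful for locating candidate blocks before examining them at finer resolution.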
Pages: 217-225
Page count: 9