Camera-based analysis of text and documents: A survey

Cited by: 234
Authors
Liang J. [1]
Doermann D. [1]
Li H. [2]
Affiliations
[1] Language and Media Processing Laboratory, Institute for Advanced Computer Studies, University of Maryland, College Park, MD
[2] Applied Media Analysis, Inc., Ellicott City, MD
Source
International Journal of Document Analysis and Recognition (IJDAR) | 2005 / Vol. 7 / Issue 2-3
Keywords
Imaging Device; Document Image; Cellular Phone; Robust Solution; Multiple Frame
DOI
10.1007/s10032-004-0138-z
Abstract
The increasing availability of high-performance, low-priced, portable digital imaging devices has created a tremendous opportunity for supplementing traditional scanning for document image acquisition. Digital cameras attached to cellular phones, PDAs, or wearable computers, and standalone image or video devices are highly mobile and easy to use; they can capture images of thick books, historical manuscripts too fragile to touch, and text in scenes, making them much more versatile than desktop scanners. Should robust solutions to the analysis of documents captured with such devices become available, there will clearly be a demand in many domains. Traditional scanner-based document analysis techniques provide us with a good reference and starting point, but they cannot be used directly on camera-captured images. Camera-captured images can suffer from low resolution, blur, and perspective distortion, as well as complex layout and interaction of the content and background. In this paper we present a survey of application domains, technical challenges, and solutions for the analysis of documents captured by digital cameras. We begin by describing typical imaging devices and the imaging process. We discuss document analysis from a single camera-captured image as well as multiple frames and highlight some sample applications under development and feasible ideas for future development. © Springer-Verlag 2005.
Pages: 84-104
Page count: 20