ICDAR 2003 robust reading competitions: Entries, results, and future directions

Cited by: 178
Authors
Lucas S.M. [1 ]
Panaretos A. [1 ]
Sosa L. [1 ]
Tang A. [1 ]
Wong S. [1 ]
Young R. [1 ]
Ashida K. [2 ]
Nagai H. [2 ]
Okamoto M. [2 ]
Yamamoto H. [2 ]
Miyao H. [2 ]
Zhu J. [3 ]
Ou W. [3 ]
Wolf C. [4 ]
Jolion J.-M. [4 ]
Todoran L. [5 ]
Worring M. [5 ]
Lin X. [6 ]
Affiliations
[1] Department of Computer Science, University of Essex
[2] Department of Information Engineering, Faculty of Engineering, Shinshu University, Nagano 380-8553
[3] Institute of Automation, Chinese Academy of Science, Beijing 100080
[4] Lyon Research Center for Images and Intelligent Information Systems, INSA de Lyon, 69621 Villeurbanne Cedex
[5] Informatics Institute, University of Amsterdam, 1098 SJ Amsterdam
[6] Hewlett-Packard Laboratories, Palo Alto, CA 94304
Source
International Journal of Document Analysis and Recognition (IJDAR) | 2005, Vol. 7, Issue 2-3
Keywords
Camera captured; Reading competition; Text locating
DOI
10.1007/s10032-004-0134-3
Abstract
This paper describes the robust reading competitions for ICDAR 2003. With the rapid growth in research over the last few years on recognizing text in natural scenes, there is an urgent need to establish some common benchmark datasets and gain a clear understanding of the current state of the art. We use the term 'robust reading' to refer to text images that are beyond the capabilities of current commercial OCR packages. We chose to break down the robust reading problem into three subproblems and run competitions for each stage, and also a competition for the best overall system. The subproblems we chose were text locating, character recognition and word recognition. By breaking down the problem in this way, we hoped to gain a better understanding of the state of the art in each of the subproblems. Furthermore, our methodology involved storing detailed results of applying each algorithm to each image in the datasets, allowing researchers to study in depth the strengths and weaknesses of each algorithm. The text-locating contest was the only one to have any entries. We give a brief description of each entry and present the results of this contest, showing cases where the leading entries succeed and fail. We also describe an algorithm for combining the outputs of the individual text locators and show how the combination scheme improves on any of the individual systems. © Springer-Verlag 2005.
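The text-locating contest scores each entry by how well its estimated text bounding boxes agree with ground-truth boxes. As a rough illustration of this kind of overlap-based scoring, the sketch below computes precision, recall, and an f-measure for one image; the specific match definition, the function names (`match_score`, `evaluate_locator`), and the alpha-weighted f-measure are assumptions for illustration, not the competition's published evaluation code.

```python
# Minimal sketch of overlap-based text-locating evaluation.
# Boxes are axis-aligned rectangles (x1, y1, x2, y2); all names and the
# exact scoring formula are illustrative assumptions, not the paper's
# official definitions.

def area(box):
    x1, y1, x2, y2 = box
    return max(0, x2 - x1) * max(0, y2 - y1)

def match_score(a, b):
    """Intersection area divided by the area of the smallest rectangle
    enclosing both boxes (0 = disjoint, 1 = identical)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = area((ix1, iy1, ix2, iy2))
    enclosing = (min(a[0], b[0]), min(a[1], b[1]),
                 max(a[2], b[2]), max(a[3], b[3]))
    return inter / area(enclosing) if area(enclosing) else 0.0

def evaluate_locator(estimates, targets, alpha=0.5):
    """Precision/recall from each box's best match, combined into a
    weighted harmonic mean (f-measure)."""
    if not estimates or not targets:
        return 0.0, 0.0, 0.0
    precision = sum(max(match_score(e, t) for t in targets)
                    for e in estimates) / len(estimates)
    recall = sum(max(match_score(t, e) for e in estimates)
                 for t in targets) / len(targets)
    f = 0.0
    if precision and recall:
        f = 1.0 / (alpha / precision + (1 - alpha) / recall)
    return precision, recall, f

# Example: one estimated box against one ground-truth box.
print(evaluate_locator([(10, 10, 60, 30)], [(12, 10, 58, 32)]))
```

A soft match score of this form rewards partial overlaps rather than requiring exact box agreement, which is why per-image results (as stored in the competition's detailed logs) can reveal where a locator finds text but misjudges its extent.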
Pages: 105-122
Number of pages: 17