Frame selection for OCR from video stream of book flipping

Cited by: 0

Authors:

Dibyayan Chakraborty

Partha Pratim Roy

Rajkumar Saini

Jose M. Alvarez

Umapada Pal

Affiliations:

[1] ISI Kolkata, Computer Vision and Pattern Recognition Unit

[2] IIT Roorkee, Department of Computer Science and Engineering

[3] Canberra Research Lab

[4] ACT

Source:

Multimedia Tools and Applications | 2018 / Vol. 77

Keywords:

Video OCR; OCR of flipping book; Video document image

DOI:

Not available

Abstract:

Optical Character Recognition (OCR) on a video stream of flipping pages is a challenging task because flipping at random speed makes it difficult to identify the frames that contain the open page image (OPI). In addition, low resolution, blur, shadows, and similar effects add significant noise to the selection of proper frames for OCR. In this paper, we focus on identifying a set of representative frames from the video stream of flipping pages without using any explicit hardware, and then perform OCR on these frames. Thus, an end-to-end solution is proposed for video streams of flipping pages. To select an OPI, we present an efficient algorithm that exploits cues from edge information during the flipping event. These cues, extracted from the region of interest (ROI) of the frame, determine whether a page is in the flipping or the open state. The open-state classification is performed by an SVM classifier trained on the edge cue information. After selecting a set of frames for each OPI, a representative frame from the OPI set is chosen for OCR. Experiments are performed on videos captured using a standard-resolution camera. The proposed method achieves 88.81% accuracy on representative frame selection, whereas GIST (Oliva and Torralba, Int J Comput Vis 42(3):145–175 (2001)) achieves only 51.28%. To the best of our knowledge, this is the first work in this area. After frame selection, we achieve 83.31% character recognition accuracy and 78.11% word recognition accuracy with traditional OCR on our dataset of flipping books.

Pages: 985-1008

Page count: 23
