Key-text spotting in documentary videos using Adaboost

Cited by: 1

Authors:

Lalonde, M. [1]

Gagnon, L. [1]

Affiliation:

[1] CRIM, R&D Dept, 550 Sherbrooke W, Suite 100, Montreal, PQ H3A 1B9, Canada

Source:

IMAGE PROCESSING: ALGORITHMS AND SYSTEMS, NEURAL NETWORKS, AND MACHINE LEARNING | 2006 / Vol. 6064

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

text detection; Adaboost; video indexing; multimedia systems;

DOI:

10.1117/12.641924

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents a method for spotting key-text in videos, based on a cascade of classifiers trained with Adaboost. The video is first reduced to a set of key-frames. Each key-frame is then analyzed for its text content. Text spotting is performed by scanning the image with a variable-size window (to account for scale) within which simple features (mean/variance of grayscale values and x/y derivatives) are extracted in various sub-areas. Training builds classifiers using the most discriminant spatial combinations of features for text detection. The text-spotting module outputs a decision map of the size of the input key-frame showing regions of interest that may contain text suitable for recognition by an OCR system. Performance is measured against a dataset of 147 key-frames extracted from 22 documentary films of the National Film Board (NFB) of Canada. A detection rate of 97% is obtained with relatively few false alarms.
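The window-scanning and feature-extraction step described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the 2x2 split into sub-areas, the forward-difference derivatives, the window sizes, the step, and the function names `window_features` and `scan` are all hypothetical. AdaBoost training and the classifier cascade are not shown.

```python
from statistics import fmean, pvariance


def window_features(image, top, left, size, grid=2):
    """Mean/variance of gray values and of x/y derivatives per sub-area.

    `image` is a list of rows of grayscale numbers. The grid x grid split
    and the forward-difference derivatives are assumptions for this sketch.
    """
    cell = size // grid
    last_row, last_col = len(image) - 1, len(image[0]) - 1
    features = []
    for gy in range(grid):
        for gx in range(grid):
            r0, c0 = top + gy * cell, left + gx * cell
            gray, dx, dy = [], [], []
            for r in range(r0, r0 + cell):
                for c in range(c0, c0 + cell):
                    gray.append(image[r][c])
                    # Forward differences, clamped at the image border.
                    dx.append(image[r][min(c + 1, last_col)] - image[r][c])
                    dy.append(image[min(r + 1, last_row)][c] - image[r][c])
            # Six numbers per sub-area: (mean, variance) of gray, dx, dy.
            for values in (gray, dx, dy):
                features.append(fmean(values))
                features.append(pvariance(values))
    return features


def scan(image, sizes=(4, 8), step=2):
    """Yield (top, left, size, features) for every window position and scale."""
    height, width = len(image), len(image[0])
    for size in sizes:
        for top in range(0, height - size + 1, step):
            for left in range(0, width - size + 1, step):
                yield top, left, size, window_features(image, top, left, size)
```

In a full system, each feature vector would be passed to the trained cascade, and the accepted windows would be accumulated into the decision map mentioned in the abstract.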

References

Pages: 8

13 entries in total

[1] [Anonymous], P AUD AND VID BAS BI
[2] Chen XR, 2004, PROC CVPR IEEE, P366
[3] DLAGNEKOV L, 2005, THESIS UCSD
[4] Freund Y, 1996, ICML
[5] FURHT B, 2004, HDB VIDEO DATABASES
[6] GAGNON L, 2005, SPIE, V6015, P341
[7] GAGNON L, 2003, SPIE, V5304, P319
[8] Lienhart R, 2002, IEEE IMAGE PROC, P900
[9] Lienhart, R; Wernicke, A. Localizing and segmenting text in images and videos [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2002, 12 (04): 256-268
[10] Sato, T; Kanade, T; Hughes, EK; Smith, MA; Satoh, S. Video OCR: indexing digital news libraries by recognition of superimposed captions [J]. MULTIMEDIA SYSTEMS, 1999, 7 (05): 385-395
