Bag of Local Convolutional Triplets for Script Identification in Scene Text

Cited by: 9

Authors:

Zdenek, Jan [1]

Nakayama, Hideki [1]

Affiliations:

[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Tokyo, Japan

Source:

2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1 | 2017

Keywords:

script identification; scene text; convolutional neural networks; bag-of-visual-words

DOI:

10.1109/ICDAR.2017.68

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The increasing interest in scene text reading in multilingual environments raises the need to recognize and distinguish between different writing systems. In this paper, we propose a novel method for script identification in scene text using triplets of local convolutional features in combination with the traditional bag-of-visual-words model. Feature triplets are created by making combinations of descriptors extracted from local patches of the input images using a convolutional neural network. This approach allows us to generate a more descriptive codeword dictionary for the bag-of-visual-words model, as the low discriminative power of weak descriptors is enhanced by other descriptors in a triplet. The proposed method is evaluated on two public benchmark datasets for scene text script identification and a public dataset for script identification in video captions. The experiments demonstrate that our method outperforms the baseline and yields competitive results on all three datasets.
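
The pipeline described in the abstract (local descriptors combined into triplets, a codeword dictionary learned over the triplets, and a bag-of-visual-words histogram per image) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions that the abstract does not specify: random vectors stand in for the CNN patch descriptors, a triplet is formed by concatenating every combination of three descriptors, the dictionary is learned with a plain k-means, and all function names (`make_triplets`, `kmeans`, `bow_histogram`, `fake_descriptors`) are hypothetical.

```python
import itertools
import random


def make_triplets(descriptors):
    """Concatenate every unordered combination of three local descriptors.
    (Concatenation is an assumption; the abstract only says 'combinations'.)"""
    return [a + b + c for a, b, c in itertools.combinations(descriptors, 3)]


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def nearest(point, codebook):
    """Index of the codeword closest to the given point."""
    return min(range(len(codebook)), key=lambda i: sq_dist(point, codebook[i]))


def kmeans(points, k, iters=10, seed=0):
    """Plain k-means, used here as an assumed way to learn the codeword dictionary."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        for i, members in enumerate(groups):
            if members:  # keep the old center if a cluster ends up empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def bow_histogram(triplets, codebook):
    """Normalized histogram of nearest-codeword assignments for one image."""
    counts = [0] * len(codebook)
    for t in triplets:
        counts[nearest(t, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Stand-in data: 4 images, each with 6 local descriptors of dimension 8.
# In the paper these descriptors come from a CNN applied to image patches.
rng = random.Random(0)


def fake_descriptors(n=6, dim=8):
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]


train_images = [fake_descriptors() for _ in range(4)]
all_triplets = [t for img in train_images for t in make_triplets(img)]
codebook = kmeans(all_triplets, k=5)
hist = bow_histogram(make_triplets(train_images[0]), codebook)
```

In this sketch `hist` is the fixed-length image representation that a downstream script classifier would consume. Enumerating all combinations grows cubically with the number of descriptors, so a practical system would likely sample or otherwise restrict the triplets.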


Pages: 369-375

Page count: 7

