Bag of Local Convolutional Triplets for Script Identification in Scene Text

Cited by: 9
Authors
Zdenek, Jan [1]
Nakayama, Hideki [1]
Affiliations
[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Tokyo, Japan
Source
2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1 | 2017
Keywords
script identification; scene text; convolutional neural networks; bag-of-visual-words
DOI
10.1109/ICDAR.2017.68
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The increasing interest in scene text reading in multilingual environments raises the need to recognize and distinguish between different writing systems. In this paper, we propose a novel method for script identification in scene text using triplets of local convolutional features in combination with the traditional bag-of-visual-words model. Feature triplets are created by making combinations of descriptors extracted from local patches of the input images using a convolutional neural network. This approach allows us to generate a more descriptive codeword dictionary for the bag-of-visual-words model, as the low discriminative power of weak descriptors is enhanced by other descriptors in a triplet. The proposed method is evaluated on two public benchmark datasets for scene text script identification and a public dataset for script identification in video captions. The experiments demonstrate that our method outperforms the baseline and yields competitive results on all three datasets.
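The pipeline outlined in the abstract can be illustrated with a minimal sketch. The abstract does not say how the descriptors in a triplet are combined, which combinations are kept, or how the codeword dictionary is learned, so everything below is an assumption made for illustration: triplets are formed by concatenating every unordered combination of three descriptors, the codebook is written by hand rather than learned, and toy 2-D vectors stand in for CNN features of local patches. The function names `make_triplets`, `nearest_codeword`, and `bow_histogram` are hypothetical and do not come from the paper.

```python
from itertools import combinations
from math import dist


def make_triplets(descriptors):
    """Concatenate every unordered combination of three local descriptors.

    Assumption: concatenation is used to combine descriptors; the abstract
    does not specify the combination rule.
    """
    return [a + b + c for a, b, c in combinations(descriptors, 3)]


def nearest_codeword(vector, codebook):
    """Return the index of the codeword closest to `vector` (Euclidean)."""
    return min(range(len(codebook)), key=lambda i: dist(vector, codebook[i]))


def bow_histogram(triplets, codebook):
    """Build an L1-normalised histogram of codeword assignments."""
    counts = [0] * len(codebook)
    for triplet in triplets:
        counts[nearest_codeword(triplet, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# Toy stand-ins for CNN descriptors of four local patches (assumed values).
descriptors = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]

# Hand-written codebook of two 6-D codewords (assumed; in practice a
# dictionary like this would be learned from training triplets).
codebook = [
    [0.0, 0.0, 0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0, 1.0, 1.0],
]

triplets = make_triplets(descriptors)   # 4 triplets, each of dimension 6
histogram = bow_histogram(triplets, codebook)
print(histogram)  # [0.5, 0.5]
```

The resulting histogram is the fixed-length image representation that a classifier would consume. Note that the number of triplets grows cubically with the number of patches, so a practical system would need to restrict which combinations are formed.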
Pages: 369-375
Page count: 7