Bag of Local Convolutional Triplets for Script Identification in Scene Text

Cited by: 9
Authors
Zdenek, Jan [1]
Nakayama, Hideki [1]
Affiliations
[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Tokyo, Japan
Source
2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1 | 2017
Keywords
script identification; scene text; convolutional neural networks; bag-of-visual-words
DOI
10.1109/ICDAR.2017.68
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The increasing interest in scene text reading in multilingual environments raises the need to recognize and distinguish between different writing systems. In this paper, we propose a novel method for script identification in scene text using triplets of local convolutional features in combination with the traditional bag-of-visual-words model. Feature triplets are created by making combinations of descriptors extracted from local patches of the input images using a convolutional neural network. This approach allows us to generate a more descriptive codeword dictionary for the bag-of-visual-words model, as the low discriminative power of weak descriptors is enhanced by other descriptors in a triplet. The proposed method is evaluated on two public benchmark datasets for scene text script identification and a public dataset for script identification in video captions. The experiments demonstrate that our method outperforms the baseline and yields competitive results on all three datasets.
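The triplet idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how triplets are selected or combined, how the codebook is learned, or how codewords are assigned, so everything concrete here is an assumption: random vectors stand in for CNN patch descriptors, a triplet is the concatenation of three descriptors, every unordered 3-combination is used, the codebook is a random sample of triplets instead of a learned dictionary, and assignment is hard nearest-codeword. The function names `make_triplets` and `bow_histogram` are hypothetical.

```python
from itertools import combinations

import numpy as np


def make_triplets(descriptors):
    """Concatenate every unordered 3-combination of patch descriptors.

    Assumption: concatenation is used to combine the three descriptors;
    the abstract does not specify the combination rule.
    """
    index_triples = combinations(range(len(descriptors)), 3)
    return np.stack([
        np.concatenate([descriptors[i], descriptors[j], descriptors[k]])
        for i, j, k in index_triples
    ])


def bow_histogram(features, codebook):
    """Hard-assign each feature to its nearest codeword and count occurrences."""
    # Squared Euclidean distance between every feature and every codeword.
    dist2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = dist2.argmin(axis=1)
    hist = np.bincount(nearest, minlength=len(codebook)).astype(float)
    return hist / hist.sum()  # normalise so the histogram sums to 1


rng = np.random.default_rng(0)

# Stand-in for CNN descriptors of 6 local patches, 4 dimensions each (assumed sizes).
patch_descriptors = rng.normal(size=(6, 4))

# 6 choose 3 = 20 triplets, each of dimension 3 * 4 = 12.
triplets = make_triplets(patch_descriptors)

# Stand-in codebook: 5 triplets sampled at random instead of a learned dictionary.
codebook = triplets[rng.choice(len(triplets), size=5, replace=False)]

# Image-level representation that a downstream classifier would consume.
histogram = bow_histogram(triplets, codebook)
```

In a realistic setting the codebook would be learned from training data and the number of triplets would have to be limited, since the count of 3-combinations grows cubically with the number of patches.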
Pages: 369-375
Number of pages: 7