Mining discriminative patches for script identification in natural scene images

Cited by: 6
Authors
Lu, Liqiong [1 ,2 ]
Wu, Dong [1 ]
Tang, Ziwei [2 ]
Yi, Yaohua [2 ]
Huang, Faliang [3 ]
Affiliations
[1] Lingnan Normal Univ, Dept Informat Engn, Zhanjiang, Peoples R China
[2] Wuhan Univ, Sch Printing & Packaging, Wuhan, Peoples R China
[3] Nanning Normal Univ, Sch Comp & Informat Engn, Nanning, Peoples R China
Keywords
Script identification; score CNN; attention CNN; discriminative patches; scene images; WORD;
DOI
10.3233/JIFS-200260
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper focuses on script identification in natural scene images. Traditional CNNs (Convolutional Neural Networks) cannot solve this problem perfectly for two reasons. First, scene images have arbitrary aspect ratios, which poses difficulties for traditional CNNs that take a fixed-size image as input. Second, some scripts with minor differences are easily confused because they share a subset of characters with the same shapes. We propose a novel approach combining Score CNN, Attention CNN, and patches. Attention CNN determines whether a patch is discriminative and calculates the contribution weight of each discriminative patch to the script identification of the whole image. Score CNN takes a discriminative patch as input and predicts the score of each script type. First, patches of the same size are extracted from the scene images. Second, these patches are used as inputs to Score CNN and Attention CNN to train two patch-level classifiers. Finally, the results that the two classifiers produce for multiple discriminative patches extracted from the same image are fused to obtain the script type of that image. Using same-size patches as CNN inputs avoids the problems caused by the arbitrary aspect ratios of scene images. The trained classifiers can mine discriminative patches to accurately identify easily confused scripts. The experimental results show the good performance of our approach on four public datasets.
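The abstract does not state the patch-sampling scheme or the fusion rule, so the following is only a minimal sketch of how the pipeline could look, not the authors' implementation. It assumes a sliding window along the image width, an attention threshold for deciding which patches count as discriminative, and an attention-weighted sum of per-patch scores. The names `patch_offsets`, `fuse_patch_scores`, `stride`, and `threshold` are hypothetical, and the score and weight values stand in for the outputs of Score CNN and Attention CNN.

```python
from typing import List, Sequence


def patch_offsets(width: int, patch: int, stride: int) -> List[int]:
    """Left x-offsets of fixed-width patches covering an image of the given width.

    Assumption: patches are taken with a sliding window along the width.
    """
    if width <= patch:
        # Image is no wider than one patch: use a single patch at offset 0.
        return [0]
    offsets = list(range(0, width - patch + 1, stride))
    if offsets[-1] != width - patch:
        # Add a final patch flush with the right edge so no pixels are skipped.
        offsets.append(width - patch)
    return offsets


def fuse_patch_scores(
    scores: Sequence[Sequence[float]],
    weights: Sequence[float],
    threshold: float = 0.5,
) -> int:
    """Fuse per-patch script scores into one image-level script index.

    scores[i]  -- per-script scores for patch i (stand-in for Score CNN output)
    weights[i] -- contribution weight of patch i (stand-in for Attention CNN output)
    Assumption: a patch is treated as discriminative when its weight reaches
    `threshold`, and fusion is an attention-weighted sum of the kept scores.
    """
    kept = [(s, w) for s, w in zip(scores, weights) if w >= threshold]
    if not kept:
        # No patch passed the threshold: fall back to using all patches.
        kept = list(zip(scores, weights))
    n_scripts = len(kept[0][0])
    fused = [sum(w * s[c] for s, w in kept) for c in range(n_scripts)]
    # Index of the script type with the highest fused score.
    return max(range(n_scripts), key=fused.__getitem__)
```

For a word image 100 pixels wide, `patch_offsets(100, 32, 32)` yields four equally sized patches, so the downstream networks never see an input whose size depends on the aspect ratio of the image. Thresholding before fusion means that patches made up of characters shared between similar scripts, which should receive low attention weights, do not dilute the image-level decision.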
Pages: 551-563
Page count: 13