SANet-SI: A new Self-Attention-Network for Script Identification in scene images

Cited by: 3
Authors
Li, Xiaomeng [1]
Zhan, Hongjian [1,2]
Shivakumara, Palaiahnakote [3]
Pal, Umapada [4]
Lu, Yue [1]
Affiliations
[1] East China Normal Univ, Sch Elect & Comp Engn, Shanghai 200241, Peoples R China
[2] East China Normal Univ, Chongqing Inst, Chongqing 401120, Peoples R China
[3] Univ Malaya UM, Fac Comp Sci & Informat Technol, Kuala Lumpur 50603, Malaysia
[4] Indian Stat Inst, CVPR Unit, Kolkata 700108, India
Keywords
Script identification; Feature fusion; Language identification;
DOI
10.1016/j.patrec.2023.04.015
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Developing an automatic method for identifying scripts in natural scene text images is of great importance for improving the performance of multilingual OCR. This paper presents a new Self-Attention Network (SANet-SI) for script identification in natural scene text images. The rationale behind SANet-SI is that each script exhibits its own distinctive pattern because scripts differ in their characteristics. To capture these patterns, we explore a self-attention-based CNN with a multi-scale feature extraction approach. The proposed multi-scale feature extraction involves extracting local and global features and fusing the two. Furthermore, to select, from the pool of features, the dominant ones that contribute most to script identification, we explore the Style-based Recalibration Module (SRM) in a new way. In addition, to improve identification performance and reduce the model size, the proposed model uses a Global Average Pooling (GAP) layer instead of Fully Connected (FC) layers. The proposed model is evaluated on the standard datasets RRC-MLT2017, SIW-13, and CVSI2015 to show its effectiveness over state-of-the-art methods in terms of confusion matrix and classification rate. We also conduct cross-dataset validation experiments to show that the proposed model is independent of the number of scripts and of the dataset used. (c) 2023 Elsevier B.V. All rights reserved.
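The model-size argument for GAP can be illustrated with simple parameter arithmetic. The sketch below is not the authors' implementation: the feature-map shape (512 channels, 4 x 16 spatial positions) and the 13-class output are hypothetical values chosen only for illustration, and the FC baseline is assumed to be a single linear layer over the flattened feature map.

```python
def fc_head_params(channels, height, width, num_classes):
    """Weights plus biases of one linear layer over the flattened C*H*W feature map."""
    return channels * height * width * num_classes + num_classes


def gap_head_params(channels, num_classes):
    """GAP itself has no parameters; only the linear classifier over C channel means counts."""
    return channels * num_classes + num_classes


def global_average_pool(feature_map):
    """Average each channel (a list of H rows of W floats) down to a single value."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


# Hypothetical shapes, assumed for illustration only (not taken from the paper).
C, H, W, K = 512, 4, 16, 13
print(fc_head_params(C, H, W, K))   # 425997
print(gap_head_params(C, K))        # 6669
```

Under these assumed shapes the GAP-based head needs roughly 64 times fewer classifier parameters, because the spatial dimensions H x W no longer multiply the weight count.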
Pages: 45-52
Number of pages: 8