A Scene Tibetan Text Detection by Combining Multi-scale and Dual-Channel Features

Cited by: 0
Authors
Dangzhi, Cairang [1, 2, 3]
Huang, Heming [1, 2, 3]
Fan, Yonghong [1, 2, 3]
Fan, Yutao [1, 2, 3]
Affiliations
[1] Qinghai Normal Univ, Sch Comp, Xining 810008, Peoples R China
[2] State Key Lab Tibetan Intelligent Informat Proc &, Xining 810008, Peoples R China
[3] Minist Educ, Key Lab Tibetan Informat Proc, Xining 810008, Peoples R China
Source
Funding
National Natural Science Foundation of China
Keywords
Multi-scale Feature; Dual-channel Attention; Scene Tibetan Text Detection; Skip Connections; YOLO
DOI
10.1007/978-3-031-61816-1_11
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Tibetan text detection in scenes plays a vital role in applications such as image search, real-time translation, and the preservation of Tibetan cultural heritage. However, detecting Tibetan text in natural scene images is challenging because of variable fonts, complex backgrounds, and poor imaging conditions. In this study, we present Multi-Scale Dual-Channel Feature Fusion (MDFF), an approach to Tibetan scene text detection. The method aims to infer text accurately in complex scenes by leveraging multi-scale interactions between text instances. MDFF incorporates a feature pyramid network with skip connections, which fuses features at different scales hierarchically. In addition, a dual-channel attention (DCA) mechanism captures rich interactions between text instances while mitigating the impact of background noise. Experimental results on the scene Tibetan text detection database (STTDD) demonstrate the effectiveness of MDFF, which achieves an F1 score of 85.20%. The proposed method outperforms the baseline model by 5 percentage points and surpasses six state-of-the-art methods in single Tibetan text detection.
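The abstract reports its headline result as an F1 score. As a reminder of how that metric is usually derived for detection tasks, the following is a minimal sketch that computes F1 as the harmonic mean of precision and recall from detection counts. The function name `f1_score` and the example counts are hypothetical and are not taken from the paper; the matching rule that decides which detections count as true positives is not described in the abstract and is left out here.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return F1 (harmonic mean of precision and recall) from detection counts."""
    predicted = true_positives + false_positives  # boxes the detector produced
    actual = true_positives + false_negatives     # ground-truth text instances
    if predicted == 0 or actual == 0:
        return 0.0  # precision or recall is undefined; report 0 by convention
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0  # no correct detections at all
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only; they are NOT results from the paper.
example = f1_score(true_positives=80, false_positives=10, false_negatives=30)
```

Because F1 combines both error types, a detector cannot raise it by suppressing false positives alone or by over-predicting boxes alone.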
Pages: 158-171
Page count: 14