A Feature Refinement Patch Embedding-Based Recognition Method for Printed Tibetan Cursive Script

Cited by: 0
Authors
Zhi, Cai Rang Dang [1 ,2 ,3 ]
Huang, Heming [1 ,2 ,3 ]
Fan, Yonghong [1 ,2 ,3 ]
Song, Dongke [1 ,2 ,3 ]
Affiliations
[1] Qinghai Normal Univ, Sch Comp Sci & Technol, Xining 810008, Peoples R China
[2] State Key Lab Tibetan Intelligent Informat Proc &, Xining 810008, Peoples R China
[3] Minist Educ, Key Lab Tibetan Informat Proc, Xining 810008, Peoples R China
Source
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT II | 2024 / Vol. 14426
Funding
U.S. National Science Foundation;
Keywords
Tibetan recognition; cursive scripts; feature refinement patch embedding; Transformer; TEXT; TRANSFORMER;
DOI
10.1007/978-981-99-8432-9_31
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recognition of Tibetan cursive scripts has important applications in automated Tibetan office software and in the conservation of ancient documents. However, there are few studies on the recognition of Tibetan cursive scripts. This paper proposes a recognition method for printed Tibetan cursive script based on feature refinement patch embedding. First, the feature refinement patch embedding (FRPE) module serializes the line text image into feature sequences. Second, a single Transformer encoder performs global modeling of the feature vectors. Finally, a fully connected layer decodes the recognition result. Experimental results show that, compared with the baseline model, the proposed method improves accuracy by 9.52% on CSTPD, a dataset containing six Tibetan cursive fonts, and achieves an average accuracy of 92.5% on that dataset. It also outperforms the baseline model on synthetic Tibetan text recognition data for natural scene images.
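The serialization step in the abstract can be illustrated with a minimal sketch. This shows only plain, non-overlapping patch splitting in the style of a Vision Transformer; it is not the FRPE module, whose refinement details the abstract does not give. The function name `image_to_patch_sequence`, the patch sizes, and the nested-list image representation are assumptions made for illustration.

```python
from typing import List

# Hypothetical representation: a grayscale line image as rows of pixel values.
Image = List[List[float]]


def image_to_patch_sequence(image: Image, patch_h: int, patch_w: int) -> List[List[float]]:
    """Split an H x W image into non-overlapping patches.

    Patches are emitted top to bottom, left to right, and each patch is
    flattened row by row into one vector. The resulting list of vectors is
    the kind of sequence a Transformer encoder would take as input (after a
    learned projection, which is omitted here).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if height % patch_h or width % patch_w:
        raise ValueError("image size must be divisible by the patch size")

    sequence: List[List[float]] = []
    for top in range(0, height, patch_h):
        for left in range(0, width, patch_w):
            patch = [
                image[top + dy][left + dx]
                for dy in range(patch_h)
                for dx in range(patch_w)
            ]
            sequence.append(patch)
    return sequence


# Toy 2 x 4 "line image" split into two 2 x 2 patches.
toy_image = [[0, 1, 2, 3],
             [4, 5, 6, 7]]
print(image_to_patch_sequence(toy_image, 2, 2))  # [[0, 1, 4, 5], [2, 3, 6, 7]]
```

In a full model, each flattened patch would be projected to an embedding, passed through the encoder, and mapped by a fully connected layer to character scores; those stages are left out because the abstract does not specify their configuration.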
Pages: 383-399
Page count: 17
Related papers
25 in total
  • [1] Vision Transformer for Fast and Efficient Scene Text Recognition
    Atienza, Rowel
    [J]. DOCUMENT ANALYSIS AND RECOGNITION - ICDAR 2021, PT I, 2021, 12821 : 319 - 334
  • [2] Text Recognition in the Wild: A Survey
    Chen, Xiaoxue
    Jin, Lianwen
    Zhu, Yuanzhi
    Luo, Canjie
    Wang, Tianwei
    [J]. ACM COMPUTING SURVEYS, 2021, 54 (02)
  • [3] Chen Y., 2020, Design and Implementation of Printed Tibetan Language Recognition Software on Android Platform
  • [4] Dosovitskiy A, 2021, Arxiv, DOI arXiv:2010.11929
  • [5] Du Y., 2022, 31 INT JOINT C ARTIF, P12593
  • [6] Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition
    Fang, Shancheng
    Xie, Hongtao
    Wang, Yuxin
    Mao, Zhendong
    Zhang, Yongdong
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 7094 - 7103
  • [7] Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition
    Fu, Zilong
    Xie, Hongtao
    Jin, Guoqing
    Guo, Junbo
    [J]. PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21), 2021, : 638 - 644
  • [8] SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition
    Huang, Mingxin
    Liu, Yuliang
    Peng, Zhenghao
    Liu, Chongyu
    Lin, Dahua
    Zhu, Shenggao
    Yuan, Nicholas
    Ding, Kai
    Jin, Lianwen
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 4583 - 4593
  • [9] Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer
    Kittenplon, Yair
    Lavi, Inbal
    Fogel, Sharon
    Bar, Yarin
    Manmatha, R.
    Perona, Pietro
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 4594 - 4603
  • [10] ImageNet Classification with Deep Convolutional Neural Networks
    Krizhevsky, Alex
    Sutskever, Ilya
    Hinton, Geoffrey E.
    [J]. COMMUNICATIONS OF THE ACM, 2017, 60 (06) : 84 - 90