Chinese Character Recognition based on Swin Transformer-Encoder

Cited by: 0
Authors
Li, Ziying [1]
Zhao, Haifeng [1, 2]
Nishizaki, Hiromitsu [2 ]
Leow, Chee Siang [2 ]
Shen, Xingfa [1 ]
Affiliations
[1] Hangzhou Dianzi Univ, Comp Sci & Technol, Hangzhou 310018, Peoples R China
[2] Univ Yamanashi, Comp Sci & Technol, Kofu, Yamanashi 4000013, Japan
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
OCR; Chinese Text Recognition; Swin Transformer; Attention; Image segmentation; STROKE EXTRACTION; NETWORK;
DOI
10.1016/j.dsp.2025.105080
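Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification codes
0808; 0809
Abstract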
Optical Character Recognition (OCR) technology, which converts printed or handwritten text into machine-readable text, holds significant application and research value in document digitization, information automation, and multilingual support. However, existing methods predominantly focus on English text recognition and often struggle with the complexities of Chinese characters. This study proposes a Chinese text recognition model based on the Swin Transformer encoder and demonstrates its adaptability to Chinese character recognition. In the image preprocessing stage, we introduce an overlapping segmentation technique that enables the encoder to capture the complex structural relationships between individual strokes in lengthy Chinese texts. Additionally, by incorporating a mapping layer between the encoder and decoder, we enhance the Swin Transformer's adaptability to small-image scenarios, thereby improving its feasibility for Chinese text recognition tasks. Experimental results indicate that the model outperforms classical models such as CRNN and ASTER on handwritten and web-based datasets, validating its robustness and reliability.
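The abstract does not give the details of the overlapping segmentation step, so the following is only a minimal sketch of the general idea: cutting a wide text-line image into fixed-width windows whose neighbours share columns, so that strokes falling on a boundary appear in two windows. The function names `overlapping_windows` and `slice_columns`, and the `patch` and `stride` parameters, are hypothetical and are not taken from the paper.

```python
def overlapping_windows(width, patch, stride):
    """Return (start, end) column ranges that cover [0, width).

    Hypothetical helper: `patch` is the window width and `stride` the step
    between window starts; stride < patch makes neighbours overlap.
    """
    if patch <= 0 or stride <= 0 or stride > patch:
        raise ValueError("need 0 < stride <= patch")
    if width <= patch:
        # Image is narrower than one window: keep it whole.
        return [(0, width)]
    starts = list(range(0, width - patch + 1, stride))
    if starts[-1] + patch < width:
        # Add a final window flush with the right edge so no column is lost.
        starts.append(width - patch)
    return [(s, s + patch) for s in starts]


def slice_columns(image, patch, stride):
    """Cut a 2-D image (list of equal-length rows) into overlapping windows."""
    width = len(image[0]) if image else 0
    return [
        [row[start:end] for row in image]
        for start, end in overlapping_windows(width, patch, stride)
    ]


if __name__ == "__main__":
    # A toy 2 x 10 "image" whose pixel values are just column indices.
    toy = [list(range(10)), list(range(10))]
    for window in slice_columns(toy, patch=4, stride=2):
        print(window[0])
```

With `patch=4` and `stride=2`, each window shares two columns with its neighbour. How the paper chooses window size and overlap, and how the windows are fed to the Swin Transformer encoder, is not stated in the abstract.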
Pages: 10