SideNet: Learning representations from interactive side information for zero-shot Chinese character recognition

Cited by: 6
Authors
Li, Ziyan [1]
Huang, Yuhao [1]
Peng, Dezhi [1]
He, Mengchao [2]
Jin, Lianwen [1]
Affiliations
[1] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510641, Peoples R China
[2] Alibaba Grp, DAMO Acad, Hangzhou, Peoples R China
Keywords
Optical character recognition; Chinese character recognition; Zero-shot learning; Open-set recognition; Stroke extraction; Network
DOI
10.1016/j.patcog.2023.110208
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Existing methods for zero-shot Chinese character recognition usually exploit a single type of side information, such as radicals, glyphs, or strokes, to establish a mapping with the input characters for the recognition of unseen categories. However, these approaches have two limitations. First, the mappings are inefficient owing to their complexity. Some existing methods design radical-level mappings using a non-differentiable dictionary matching strategy, whereas others construct sophisticated embeddings to map seen and unseen characters into a unified latent space. Although the latter approach is straightforward, it lacks a learnable scheme for explicit structure construction. Second, the complementarity among multiple types of side information has not been effectively explored. For example, radicals provide structural knowledge at an abstract level, whereas glyphs offer detailed information on their figurative counterparts. To this end, we propose a new method called SideNet that jointly learns character-level representations assisted by two types of interactive side information: radicals and glyphs. SideNet contains a structural conversion module that extracts radical knowledge via dimensional decomposition, and a spatial conversion module that encodes the radical counting map to produce an interactive outcome between radicals and glyphs. Finally, we propose a new classifier that integrates the converted features through a similarity-guided fusion mechanism. To the best of our knowledge, this study represents the first attempt to integrate these two types of side information and explore a joint representation for zero-shot learning. Experiments show that SideNet consistently outperforms existing methods by a significant margin in diverse scenarios, including handwriting, printed art, natural scenes, and ancient Chinese characters, which demonstrates the potential of joint learning with multiple types of side information.
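The abstract does not specify how the similarity-guided fusion classifier works, so the following is only a minimal illustrative sketch of one way two kinds of side information could be combined for zero-shot classification. It is not the authors' implementation. All names (`cosine`, `softmax`, `fuse_and_classify`), the prototype dictionaries, and the weighting rule (softmax over each branch's best similarity) are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(radical_feat, glyph_feat, radical_protos, glyph_protos):
    """Hypothetical fusion rule (an assumption, not taken from the paper):
    score every class in each branch by cosine similarity to its prototype,
    weight each branch by a softmax over its best similarity, and return
    the class with the highest weighted sum together with all fused scores."""
    labels = sorted(radical_protos)
    rad_sims = [cosine(radical_feat, radical_protos[c]) for c in labels]
    gly_sims = [cosine(glyph_feat, glyph_protos[c]) for c in labels]
    # The branch whose best match is stronger receives the larger weight.
    w_rad, w_gly = softmax([max(rad_sims), max(gly_sims)])
    fused = {c: w_rad * r + w_gly * g
             for c, r, g in zip(labels, rad_sims, gly_sims)}
    return max(fused, key=fused.get), fused


# Toy prototypes for two classes. Unseen classes only need side-information
# prototypes here, not training images.
radical_protos = {"ming": [1.0, 0.0, 0.0], "peng": [0.0, 1.0, 0.0]}
glyph_protos = {"ming": [1.0, 0.0], "peng": [0.0, 1.0]}

# The radical branch strongly favours "ming"; the glyph branch mildly favours "peng".
best, scores = fuse_and_classify([0.9, 0.1, 0.0], [0.4, 0.6],
                                 radical_protos, glyph_protos)
print(best, scores)
```

In this toy case the two branches disagree, and the fused score follows the branch whose best match is more confident. A learned model would replace the hand-written prototypes and weighting rule with trained components.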
Pages: 14