SideNet: Learning representations from interactive side information for zero-shot Chinese character recognition

Cited by: 6

Authors:

Li, Ziyan [1]

Huang, Yuhao [1]

Peng, Dezhi [1]

He, Mengchao [2]

Jin, Lianwen [1]

Affiliations:

[1] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510641, Peoples R China

[2] Alibaba Grp, DAMO Acad, Hangzhou, Peoples R China

Source:

PATTERN RECOGNITION | 2024 / Vol. 148

Keywords:

Optical character recognition; Chinese character recognition; Zero shot learning; Open set recognition; STROKE EXTRACTION; NETWORK;

DOI:

10.1016/j.patcog.2023.110208

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Existing methods for zero-shot Chinese character recognition usually exploit a single type of side information such as radicals, glyphs, or strokes to establish a mapping with the input characters for the recognition of unseen categories. However, these approaches have two limitations. Firstly, the mappings are inefficient owing to their complexity. Some existing methods design radical-level mappings using a non-differentiable dictionary matching strategy, whereas others construct sophisticated embeddings to map seen and unseen characters into a unified latent space. Although the latter approach is straightforward, it lacks a learnable scheme for explicit structure construction. Secondly, the complementarity within multiple types of side information has not been effectively explored. For example, the radicals provide structural knowledge at an abstract level, whereas glyphs offer detailed information on their figurative counterparts. To this end, we propose a new method called SideNet that jointly learns character-level representations assisted by two types of interactive side information: radicals and glyphs. SideNet contains a structural conversion module that extracts radical knowledge via dimensional decomposition, and a spatial conversion module that encodes the radical counting map to produce an interactive outcome between radicals and glyph. Finally, we propose a new classifier that integrates the converted features by a similarity-guided fusion mechanism. To the best of our knowledge, this study represents the first attempt to integrate these two types of side information and explore a joint representation for zero-shot learning. Experiments show that SideNet consistently outperforms existing methods by a significant margin in diverse scenarios, including handwriting, printed art, natural scenes, and ancient Chinese characters, which demonstrates the potential of joint learning with multiple types of side information.

Pages: 14
