Towards open-set text recognition via label-to-prototype learning

Cited by: 32

Authors:

Liu, Chang [1]

Yang, Chun [1]

Qin, Hai-Bo [1]

Zhu, Xiaobin [1]

Liu, Cheng-Lin [2]

Yin, Xu-Cheng [1]

Affiliations:

[1] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Dept Comp Sci & Technol, Beijing 100083, Peoples R China

[2] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China

Source:

PATTERN RECOGNITION | 2023 / Volume 134

Funding:

National Natural Science Foundation of China;

Keywords:

Open-set recognition; Scene text recognition; Low-shot recognition; NETWORK; CLASSIFICATION;

DOI:

10.1016/j.patcog.2022.109109

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Scene text recognition is a popular research topic which is also extensively utilized in the industry. Although many methods have achieved satisfactory performance for the close-set text recognition challenges, these methods lose feasibility in open-set scenarios, where collecting data or retraining models for novel characters could yield a high cost. For example, annotating samples for foreign languages can be expensive, whereas retraining the model each time when a "novel" character is discovered from historical documents costs both time and resources. In this paper, we introduce and formulate a new open-set text recognition task which demands the capability to spot and recognize novel characters without retraining. A label-to-prototype learning framework is also proposed as a baseline for the new task. Specifically, the framework introduces a generalizable label-to-prototype mapping function to build prototypes (class centers) for both seen and unseen classes. An open-set predictor is then utilized to recognize or reject samples according to the prototypes. The implementation of rejection capability over out-of-set characters allows automatic spotting of unknown characters in the incoming data stream. Extensive experiments show that our method achieves promising performance on a variety of zero-shot, close-set, and open-set text recognition datasets. (c) 2022 Elsevier Ltd. All rights reserved.
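The abstract's two stages (a label-to-prototype mapping, then an open-set predictor that recognizes or rejects) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the cosine-similarity score, the fixed rejection threshold, the identity mapping standing in for a learned mapping function, and the three-dimensional "templates" are all hypothetical choices made only to show the idea.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def build_prototypes(label_templates, mapping_fn):
    """Map each label's side information to a unit-length prototype (class center)."""
    return {label: normalize(mapping_fn(t)) for label, t in label_templates.items()}

def open_set_predict(feature, prototypes, reject_threshold):
    """Return the most similar label, or None when no prototype is similar enough."""
    best_label, best_score = None, -1.0
    for label, proto in prototypes.items():
        score = cosine(feature, proto)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= reject_threshold else None

# Hypothetical toy side information for two seen classes.
templates = {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 0.0]}

def identity(t):
    """Assumption: stands in for a learned label-to-prototype mapping."""
    return t

prototypes = build_prototypes(templates, identity)

known = open_set_predict([0.9, 0.1, 0.0], prototypes, 0.8)     # close to "A"
rejected = open_set_predict([0.0, 0.1, 0.9], prototypes, 0.8)  # far from every prototype

# A novel class is added by mapping its label to a prototype; nothing is retrained.
extended = dict(prototypes)
extended.update(build_prototypes({"C": [0.0, 0.0, 1.0]}, identity))
novel = open_set_predict([0.0, 0.1, 0.9], extended, 0.8)

print(known, rejected, novel)
```

In this sketch, a sample that is rejected under the original prototype set becomes recognizable once a prototype for its class is built from the label alone, which mirrors the "no retraining" property the abstract describes.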

Pages: 13

