SMILE: SEQUENCE-TO-SEQUENCE DOMAIN ADAPTATION WITH MINIMIZING LATENT ENTROPY FOR TEXT IMAGE RECOGNITION

Cited by: 7

Authors:

Chang, Yen-Cheng [1]

Chen, Yi-Chang [1]

Chang, Yu-Chuan [1]

Yeh, Yi-Ren [2]

Affiliations:

[1] E SUN Financial Holding Co Ltd, Taipei, Taiwan

[2] National Kaohsiung Normal University, Department of Mathematics, Kaohsiung, Taiwan

Source:

2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP | 2022

Keywords:

domain adaptation; sequence-to-sequence; entropy minimization; self-paced learning

DOI:

10.1109/ICIP46576.2022.9897599

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Excellent text recognition results have been obtained by training recognition models on synthetic images. However, recognizing text in real-world images remains challenging because of the domain shift between synthetic and real-world text images. One strategy for reducing this domain difference without manual annotation is unsupervised domain adaptation (UDA). Because text recognition is a sequential labeling task, most popular UDA methods cannot be applied to it directly. To tackle this problem, we propose a UDA method that minimizes latent entropy on sequence-to-sequence attention-based models with class-balanced self-paced learning. Experimental results show that the proposed framework achieves better recognition results than existing methods on most UDA text recognition benchmarks. All code is publicly available.
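The entropy-minimization idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that entropy is measured on the per-step softmax output of a decoder (standing in for the paper's latent entropy) and that self-paced selection keeps the lowest-entropy fraction of unlabeled target samples. The class-balancing step is omitted, and all function names (`softmax`, `step_entropy`, `sequence_entropy`, `select_confident`) and the `keep_ratio` parameter are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one decoding step's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def step_entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of one per-step class distribution."""
    return -sum(p * math.log(p + eps) for p in probs)


def sequence_entropy(logits_per_step):
    """Mean entropy over all decoding steps of one sequence.

    In an entropy-minimization setup, a quantity like this would be
    averaged over unlabeled target samples and used as a loss term.
    """
    entropies = [step_entropy(softmax(step)) for step in logits_per_step]
    return sum(entropies) / len(entropies)


def select_confident(batch_logits, keep_ratio):
    """Self-paced selection (assumed form): indices of the lowest-entropy
    sequences, keeping a `keep_ratio` fraction of the batch (at least one).
    """
    order = sorted(range(len(batch_logits)),
                   key=lambda i: sequence_entropy(batch_logits[i]))
    k = max(1, int(len(order) * keep_ratio))
    return order[:k]


if __name__ == "__main__":
    uncertain = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]  # uniform steps
    confident = [[9.0, 0.0, 0.0, 0.0], [0.0, 9.0, 0.0, 0.0]]  # peaked steps
    print(sequence_entropy(uncertain), sequence_entropy(confident))
    print(select_confident([uncertain, confident], keep_ratio=0.5))
```

In a self-paced schedule, `keep_ratio` would typically grow during training so that harder (higher-entropy) target samples are admitted later; the schedule itself is left out of this sketch.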

Citation

Pages: 431-435

Page count: 5
