Generating Text Sequence Images for Recognition

Cited by: 5

Authors:

Gong, Yanxiang [1]

Deng, Linjie [1]

Ma, Zheng [1]

Xie, Mei [1]

Affiliations:

[1] University of Electronic Science and Technology of China, School of Information and Communication Engineering, 2006 Xiyuan Ave, Chengdu 611731, Sichuan, People's Republic of China

Source:

NEURAL PROCESSING LETTERS | 2020 / Volume 51 / Issue 2

Keywords:

Image generation; Text sequence images; Training data; Text recognition

DOI:

10.1007/s11063-019-10166-x

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Recently, methods based on deep learning have dominated the field of text recognition. Given a large amount of training data, most of them achieve state-of-the-art performance. However, it is hard to collect and label enough text sequence images from real scenes. To mitigate this issue, several methods for synthesizing text sequence images have been proposed, yet they usually require complicated preceding or follow-up steps. In this work, we present a method that can generate an unlimited amount of training data without any auxiliary pre- or post-processing. We treat the generation task as an image-to-image translation problem and use conditional adversarial networks to produce realistic text sequence images from their semantic counterparts. We assess our method with several evaluation metrics, and the results demonstrate that the quality of the generated data is satisfactory. The code and dataset will be made publicly available soon.


Pages: 1677-1688

Page count: 12
