Balanced Synthetic Data for Accurate Scene Text Spotting

Authors
Yao, Ying [1]
Huang, Zhangjin [2]
Affiliations
[1] Univ Sci & Technol China, Sch Software Engn, Hefei 230051, Anhui, Peoples R China
[2] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230027, Anhui, Peoples R China
Source
TENTH INTERNATIONAL CONFERENCE ON DIGITAL IMAGE PROCESSING (ICDIP 2018) | 2018 / Vol. 10806
Keywords
synthesize and balance; text detection; text recognition; neural networks
DOI
10.1117/12.2503258
Chinese Library Classification number
O43 [Optics]
Discipline classification code
070207; 0803
Abstract
Previous approaches to scene text detection or recognition have achieved promising performance across various benchmarks, and many strong neural network models are available for training the desired classifiers. Beyond the design of loss functions and network architectures, the size and quality of the dataset are key to using neural networks effectively. In this paper we propose a new method for synthesizing text in natural scene images that takes data balance into account. For each image, we estimate region normals from depth and region information. After choosing a text from the text source, we blend it into the original image using the homography matrix between the original region contours and the mask contours in which the text is placed directly. In particular, the text is selected using a specific loss function that reflects the distance between the current character distribution and the target character distribution. Text detection experiments on the standard ICDAR2015 dataset and on the augmented dataset show that training with our balanced synthetic dataset yields an 84.5% F-score, a 2% increase over the result on the standard dataset alone and also higher than the result with a synthetic dataset without balancing. For text recognition, training on balanced synthetic datasets gives a large improvement over training on some public standard recognition datasets and also outperforms synthetic datasets without balancing.
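The balance-aware text selection described in the abstract can be illustrated with a small sketch. The abstract does not give the actual loss function, so everything here is an assumption made for illustration: the L1 distance, the uniform target distribution, the greedy one-text-at-a-time selection, and the hypothetical names `distribution`, `l1_distance`, and `pick_balanced_text`.

```python
from collections import Counter


def distribution(counts, alphabet):
    """Normalize raw character counts into relative frequencies."""
    total = sum(counts[c] for c in alphabet)
    if total == 0:
        return {c: 0.0 for c in alphabet}
    return {c: counts[c] / total for c in alphabet}


def l1_distance(p, q):
    """Assumed loss: L1 distance between two character distributions."""
    return sum(abs(p[c] - q[c]) for c in p)


def pick_balanced_text(candidates, counts, target, alphabet):
    """Greedily pick the candidate text whose characters, once added to the
    running counts, bring the distribution closest to the target."""
    best, best_loss = None, float("inf")
    for text in candidates:
        trial = counts + Counter(ch for ch in text if ch in alphabet)
        loss = l1_distance(distribution(trial, alphabet), target)
        if loss < best_loss:
            best, best_loss = text, loss
    return best, best_loss


# Toy example: characters seen so far are skewed toward 'a' and 'b'.
alphabet = "abc"
target = {c: 1 / 3 for c in alphabet}  # assumed uniform target
counts = Counter("aaabb")
choice, loss = pick_balanced_text(["aa", "ab", "cc"], counts, target, alphabet)
print(choice, round(loss, 4))
```

In this toy case the under-represented character 'c' wins, since adding "cc" moves the running distribution closest to the uniform target. A KL divergence or a non-uniform target could be substituted without changing the structure.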
Pages: 8