Balanced Synthetic Data for Accurate Scene Text Spotting

Cited by: 0
Authors
Yao, Ying [1]
Huang, Zhangjin [2]
Affiliations
[1] Univ Sci & Technol China, Sch Software Engn, Hefei 230051, Anhui, Peoples R China
[2] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230027, Anhui, Peoples R China
Source
TENTH INTERNATIONAL CONFERENCE ON DIGITAL IMAGE PROCESSING (ICDIP 2018) | 2018 / Vol. 10806
Keywords
synthesize and balance; text detection; text recognition; neural networks
DOI
10.1117/12.2503258
Chinese Library Classification number
O43 [Optics]
Subject classification code
070207; 0803
Abstract
Previous approaches to scene text detection or recognition have achieved promising performance across various benchmarks, and many strong neural network models are available for training the desired classifiers. Beyond the design of loss functions and network architectures, the size and quality of the dataset are key to using neural networks effectively. In this paper we propose a new method for synthesizing text in natural scene images that takes data balance into account. For each image, we obtain region normals from depth and region information. After choosing a text from the text source, we blend it into the original image using the homography matrix between the original region contours and the mask contours in which the text is placed directly. In particular, the text source is obtained with a specific loss function that reflects the distance between the current character distribution and the target character distribution. Text detection experiments on the standard ICDAR2015 dataset and an augmented dataset show that our balanced synthetic dataset achieves an 84.5% F-score, a 2% increase over the result on the standard dataset and also higher than the synthetic dataset without balancing. For text recognition, training on balanced synthetic datasets yields a large improvement over training on some public standard recognition datasets and also outperforms synthetic datasets without balancing.
Pages: 8