Balanced Synthetic Data for Accurate Scene Text Spotting

Cited by: 0
Authors
Yao, Ying [1 ]
Huang, Zhangjin [2 ]
Affiliations
[1] Univ Sci & Technol China, Sch Software Engn, Hefei 230051, Anhui, Peoples R China
[2] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230027, Anhui, Peoples R China
Source
TENTH INTERNATIONAL CONFERENCE ON DIGITAL IMAGE PROCESSING (ICDIP 2018) | 2018 / Vol. 10806
Keywords
synthesize and balance; text detection; text recognition; neural networks;
DOI
10.1117/12.2503258
Chinese Library Classification
O43 [Optics];
Discipline codes
070207; 0803;
Abstract
Previous approaches to scene text detection and recognition have achieved promising performance across various benchmarks, and many strong neural network models are available for training the desired classifiers. Beyond designing loss functions and network architectures, the size and quality of the training dataset are key to using neural networks effectively. In this paper we propose a new method for synthesizing text in natural scene images that takes data balance into account. For each image, we estimate region surface normals from depth and region segmentation information. After choosing text from a text source, we blend it into the original image using the homography between the original region contour and the mask contour into which the text is placed. In particular, the text source is selected with a loss function that measures the distance between the current character distribution and a target character distribution. Text detection experiments on the standard ICDAR2015 dataset and its augmented version show that training on our balanced synthetic dataset yields an 84.5% F-score, a 2% improvement over training on the standard dataset alone and higher than training on an unbalanced synthetic dataset. For text recognition, training on balanced synthetic datasets likewise brings substantial improvements over several public standard recognition datasets and outperforms unbalanced synthetic datasets.
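The balancing idea in the abstract — selecting source text so that the accumulated character distribution approaches a target distribution — can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the paper's implementation: the greedy selection, the L1 distance, and the uniform target distribution are all stand-ins for the paper's unspecified loss function.

```python
# Hedged sketch of character-distribution balancing (assumed greedy L1
# criterion and uniform target; the paper's actual loss is not specified here).
from collections import Counter
import string

def char_distribution(counts, alphabet):
    """Normalize character counts into a distribution over the alphabet."""
    total = sum(counts[c] for c in alphabet) or 1
    return {c: counts[c] / total for c in alphabet}

def l1_distance(p, q, alphabet):
    """L1 distance between two distributions over the alphabet."""
    return sum(abs(p[c] - q[c]) for c in alphabet)

def select_balanced_text(corpus, n_words, alphabet=string.ascii_lowercase):
    """Greedily pick n_words from corpus so that the accumulated character
    distribution moves toward a uniform target distribution."""
    target = {c: 1 / len(alphabet) for c in alphabet}
    counts = Counter()
    chosen = []
    for _ in range(n_words):
        best_word, best_dist = None, float("inf")
        for word in corpus:
            # Tentatively add this word's characters and score the result.
            trial = counts + Counter(ch for ch in word.lower() if ch in alphabet)
            d = l1_distance(char_distribution(trial, alphabet), target, alphabet)
            if d < best_dist:
                best_word, best_dist = word, d
        chosen.append(best_word)
        counts += Counter(ch for ch in best_word.lower() if ch in alphabet)
    return chosen
```

For example, with a corpus of `["aaa", "abc"]` and one word to pick, the sketch prefers `"abc"`, whose characters are spread more evenly toward the uniform target. A production version would score whole text lines rather than single words and would use the paper's actual distance measure.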
Pages: 8