Real-Time Scene Text Detection with Differentiable Binarization

Authors
Liao, Minghui [1]
Wan, Zhaoyi [2]
Yao, Cong [2]
Chen, Kai [3, 4]
Bai, Xiang [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan, Hubei, Peoples R China
[2] Megvii, Beijing, Peoples R China
[3] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[4] Onlyou Tech, Shenzhen, Guangdong, Peoples R China
Source
THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2020, Vol. 34
Funding
National Key Research and Development Program;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, segmentation-based methods have become quite popular in scene text detection, as segmentation results can more accurately describe scene text of various shapes, such as curved text. However, binarization post-processing, which converts the probability maps produced by a segmentation method into bounding boxes/regions of text, is essential for segmentation-based detection. In this paper, we propose a module named Differentiable Binarization (DB), which can perform the binarization process within a segmentation network. Optimized along with a DB module, a segmentation network can adaptively set the thresholds for binarization, which not only simplifies the post-processing but also enhances the performance of text detection. Based on a simple segmentation network, we validate the performance improvements of DB on five benchmark datasets, where it consistently achieves state-of-the-art results in terms of both detection accuracy and speed. In particular, with a light-weight backbone, the performance improvements from DB are significant, so that we can look for an ideal tradeoff between detection accuracy and efficiency. Specifically, with a ResNet-18 backbone, our detector achieves an F-measure of 82.8, running at 62 FPS, on the MSRA-TD500 dataset. Code is available at: https://github.com/MhLiao/DB.
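The abstract does not spell out how binarization is made differentiable. As a rough illustration only, the sketch below assumes the approximate step function usually associated with DB, B = 1 / (1 + exp(-k (P - T))), where P is the probability map, T is a per-pixel threshold map, and k is an amplification factor (k = 50 is assumed here). The function names and the toy 2x2 maps are hypothetical and are not taken from the authors' repository.

```python
import math

def differentiable_binarize(p, t, k=50.0):
    """Smooth stand-in for a step function (assumed form, see lead-in).

    Returns a value near 1 when p is above the threshold t and near 0
    when it is below; unlike a hard threshold, it has a usable gradient.
    """
    return 1.0 / (1.0 + math.exp(-k * (p - t)))

def hard_binarize(p, t):
    """Standard, non-differentiable binarization for comparison."""
    return 1.0 if p >= t else 0.0

# Hypothetical toy data: a probability map and a per-pixel threshold map.
prob_map = [[0.10, 0.45], [0.55, 0.90]]
thresh_map = [[0.30, 0.50], [0.50, 0.30]]

approx_map = [
    [differentiable_binarize(p, t) for p, t in zip(p_row, t_row)]
    for p_row, t_row in zip(prob_map, thresh_map)
]
hard_map = [
    [hard_binarize(p, t) for p, t in zip(p_row, t_row)]
    for p_row, t_row in zip(prob_map, thresh_map)
]

for a_row, h_row in zip(approx_map, hard_map):
    print([round(a, 4) for a in a_row], h_row)
```

With a large k, the smooth map rounds to the same values as the hard threshold, while pixels close to their threshold (such as 0.45 against 0.50) keep intermediate values that can carry gradient during training.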
Pages: 11474-11481
Page count: 8