Real-Time Scene Text Detection with Differentiable Binarization

Cited by: 0
Authors
Liao, Minghui [1]
Wan, Zhaoyi [2]
Yao, Cong [2]
Chen, Kai [3, 4]
Bai, Xiang [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan, Hubei, Peoples R China
[2] Megvii, Beijing, Peoples R China
[3] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[4] Onlyou Tech, Shenzhen, Guangdong, Peoples R China
Source
THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2020 / Vol. 34
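Funding
National Key Research and Development Program of China;
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code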
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, segmentation-based methods have become quite popular in scene text detection, as segmentation results can more accurately describe scene text of various shapes, such as curved text. However, segmentation-based detection depends on a binarization post-processing step, which converts the probability maps produced by a segmentation method into bounding boxes/regions of text. In this paper, we propose a module named Differentiable Binarization (DB), which can perform the binarization process in a segmentation network. Optimized along with a DB module, a segmentation network can adaptively set the thresholds for binarization, which not only simplifies the post-processing but also enhances the performance of text detection. Based on a simple segmentation network, we validate the performance improvements of DB on five benchmark datasets, where it consistently achieves state-of-the-art results in terms of both detection accuracy and speed. In particular, with a light-weight backbone, the performance improvements from DB are significant, so that we can look for an ideal tradeoff between detection accuracy and efficiency. Specifically, with a ResNet-18 backbone, our detector achieves an F-measure of 82.8, running at 62 FPS, on the MSRA-TD500 dataset. Code is available at: https://github.com/MhLiao/DB.
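The abstract does not spell out how binarization becomes differentiable. One common way to read the idea is to replace the hard threshold step with a steep sigmoid of the difference between the probability value and a learned threshold. The sketch below illustrates that reading for a single pixel. The function names, the amplification factor `k = 50`, and the sample values are assumptions made for illustration and are not taken from this record.

```python
import math


def hard_binarize(p: float, t: float) -> float:
    """Standard binarization: a step function with zero gradient almost everywhere."""
    return 1.0 if p >= t else 0.0


def soft_binarize(p: float, t: float, k: float = 50.0) -> float:
    """Steep-sigmoid approximation of the step function.

    p: probability-map value in [0, 1]
    t: threshold value in [0, 1] (learned per pixel in this reading)
    k: assumed amplification factor; larger k is closer to a hard step
    """
    return 1.0 / (1.0 + math.exp(-k * (p - t)))


def soft_binarize_grad_p(p: float, t: float, k: float = 50.0) -> float:
    """Derivative of soft_binarize with respect to p: k * b * (1 - b)."""
    b = soft_binarize(p, t, k)
    return k * b * (1.0 - b)


if __name__ == "__main__":
    # Hypothetical probability/threshold pairs for three pixels.
    for p, t in [(0.9, 0.3), (0.1, 0.3), (0.32, 0.3)]:
        print(p, t, hard_binarize(p, t), soft_binarize(p, t), soft_binarize_grad_p(p, t))
```

Because the soft version has a non-zero gradient near the threshold, both the probability map and the threshold can in principle be trained jointly, which matches the abstract's claim that the network sets its thresholds adaptively.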
Pages: 11474-11481
Page count: 8