IDBNet: Improved differentiable binarisation network for natural scene text detection

Cited by: 0
Authors
Zhang, Zhijia [1]
Shao, Yiming [1, 2]
Wang, Ligang [1, 2]
Li, Haixing [1, 4]
Liu, Yunpeng [3]
Affiliations
[1] Shenyang Univ Technol, Coll Artificial Intelligence, Shenyang, Peoples R China
[2] Shenyang Key Lab Informat Percept & Edge Comp, Shenyang, Peoples R China
[3] Chinese Acad Sci, Shenyang Inst Automat, Shenyang, Peoples R China
[4] Shenyang Univ Technol, Coll Artificial Intelligence, Shenyang 110870, Peoples R China
Keywords
computer vision; natural scenes; text detection
DOI
10.1049/cvi2.12241
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Text in natural scenes carries rich semantic information that helps people understand and analyse everyday things. This paper addresses two problems in natural scenes with complex backgrounds, the discrete spatial distribution of text and its variable geometric size, and proposes an end-to-end natural scene text detection method based on DBNet. The authors first use IResNet as the backbone network, which retains more text features without increasing the number of network parameters. A Transformer-based module is then introduced in the feature extraction stage to strengthen the correlation between high-level feature pixels. Next, a spatial pyramid pooling structure is added at the end of feature extraction; it combines local and global features, enriches the expressive ability of the feature maps, and alleviates the detection limitations caused by the geometric size of features. Finally, to better integrate the features of each level, a dual attention module is embedded after multi-scale feature fusion. Extensive experiments are conducted on the MSRA-TD500, CTW1500, ICDAR2015, and MLT2017 datasets. The results show that IDBNet improves the average precision, recall, and F-measure of text detection compared with state-of-the-art text detection methods, and has higher predictive ability and practicability. In summary, the paper designs an improved natural scene text detection algorithm based on DBNet to handle the discrete spatial distribution of text and the diversity of its geometric size; on multiple public text detection datasets, the proposed method achieves advanced detection performance in F-measure, which verifies its effectiveness.
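The abstract builds on DBNet but does not restate the differentiable binarisation step that gives the method its name. As a minimal sketch, the block below follows the formulation in the original DBNet paper, B = 1 / (1 + exp(-k (P - T))), where P is the probability map, T is the learned threshold map, and k is an amplification factor (50 in DBNet). Whether IDBNet keeps this exact form and value of k is an assumption, and the function names are hypothetical, chosen only for illustration.

```python
import math


def differentiable_binarisation(p, t, k=50.0):
    """Approximate binary value for one pixel.

    p: probability-map value in [0, 1]
    t: threshold-map value in [0, 1]
    k: amplification factor (50 in the original DBNet paper;
       assumed, not confirmed, to carry over to IDBNet)
    """
    # A steep sigmoid around the learned threshold: close to a hard
    # step function, but still differentiable for end-to-end training.
    return 1.0 / (1.0 + math.exp(-k * (p - t)))


def binarise_map(prob_map, thresh_map, k=50.0):
    """Apply the per-pixel function to two equally sized 2D lists."""
    return [
        [differentiable_binarisation(p, t, k) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]


if __name__ == "__main__":
    prob = [[0.9, 0.1], [0.5, 0.7]]
    thresh = [[0.3, 0.5], [0.5, 0.6]]
    for row in binarise_map(prob, thresh):
        print([round(v, 4) for v in row])
```

Pixels whose probability is well above the local threshold map to values near 1, those well below map to values near 0, and a pixel exactly at its threshold maps to 0.5.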
Pages: 224-235
Number of pages: 12