IDBNet: Improved differentiable binarisation network for natural scene text detection

Authors
Zhang, Zhijia [1]
Shao, Yiming [1,2]
Wang, Ligang [1,2]
Li, Haixing [1,4]
Liu, Yunpeng [3]
Affiliations
[1] Shenyang Univ Technol, Coll Artificial Intelligence, Shenyang, Peoples R China
[2] Shenyang Key Lab Informat Percept & Edge Comp, Shenyang, Peoples R China
[3] Chinese Acad Sci, Shenyang Inst Automat, Shenyang, Peoples R China
[4] Shenyang Univ Technol, Coll Artificial Intelligence, Shenyang 110870, Peoples R China
Keywords
computer vision; natural scenes; text detection
DOI
10.1049/cvi2.12241
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text in natural scenes carries rich semantic information that helps people understand and analyse everyday things. This paper addresses two problems in natural scenes with complex backgrounds, the discrete spatial distribution of text and its variable geometric size, and proposes an end-to-end natural scene text detection method based on DBNet. The authors first use IResNet as the backbone network, which retains more text features without increasing the number of network parameters. Furthermore, a Transformer-based module is introduced in the feature extraction stage to strengthen the correlation between high-level feature pixels. The authors then add a spatial pyramid pooling structure at the end of feature extraction, which combines local and global features, enriches the expressive ability of the feature maps, and alleviates the detection limitations caused by the geometric size of features. Finally, to better integrate the features of each level, a dual attention module is embedded after multi-scale feature fusion. Extensive experiments are conducted on the MSRA-TD500, CTW1500, ICDAR2015, and MLT2017 datasets. The results show that IDBNet improves average precision, recall, and F-measure compared with state-of-the-art text detection methods and has higher predictive ability and practicability. This paper designs an improved natural scene text detection algorithm based on DBNet to solve the problems of the discrete spatial distribution of text and the diversity of text geometric size. On multiple public text detection datasets, the F-measure of the proposed method reaches an advanced level, which verifies the effectiveness of the method.
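The abstract builds on DBNet but does not restate its core step. As background only, the following is a minimal sketch of differentiable binarisation as formulated in the original DBNet work: a hard threshold is replaced by a steep sigmoid, B = 1 / (1 + exp(-k (P - T))), where P is the probability map, T the learned threshold map, and k an amplification factor (50 in DBNet). The function name and the toy maps are hypothetical, and nothing here reflects IDBNet's own backbone, Transformer, pooling, or attention modules.

```python
import math


def differentiable_binarisation(prob, thresh, k=50.0):
    """Approximate binary map: B = 1 / (1 + exp(-k * (P - T))) per pixel.

    prob and thresh are equally sized 2-D lists with values in [0, 1].
    k = 50 follows the original DBNet setting (an assumption for IDBNet).
    """
    return [
        [1.0 / (1.0 + math.exp(-k * (p - t))) for p, t in zip(prow, trow)]
        for prow, trow in zip(prob, thresh)
    ]


# Toy 2x2 probability and threshold maps (illustrative values only).
prob_map = [[0.9, 0.2], [0.5, 0.35]]
thresh_map = [[0.3, 0.3], [0.5, 0.3]]
binary_map = differentiable_binarisation(prob_map, thresh_map)
```

Pixels well above their threshold map to values near 1, pixels well below map to values near 0, and a pixel exactly at its threshold maps to 0.5. Because the function is smooth, gradients can flow through the binarisation step during training.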
Pages: 224-235
Page count: 12