Scene Text Detection Based on Multi-Dimensional Feature Fusion with Instance-Wise Loss

Cited by: 0
Authors
Wu, Qin [1]
Zhu, Peiwen [1]
Guo, Guodong [2]
Affiliations
[1] Jiangnan Univ, Dept Comp Sci, Wuxi 214122, Peoples R China
[2] West Virginia Univ, Dept Comp Sci & Elect Engn, Morgantown, WV 26505 USA
Keywords
Scene text detection; boundary refinement branch; multi-dimensional feature fusion; instance-wise loss
DOI
10.1142/S0218001423530014
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Over the past few years, scene text detection has progressed rapidly thanks to the development of deep neural networks. However, segmentation-based methods may fail to accurately detect pixels near text boundaries, and scale variations among text instances may cause small texts to be missed. To tackle these problems, we propose a novel segmentation-based detector for scene text detection that improves the quality of the detected texts. Specifically, a Multi-dimensional Feature Fusion module extracts structural and spatial text features along the height, width and channel dimensions, which improves the representation ability of the network. To obtain more accurate boundaries for the detected text instances, a Boundary Refinement Branch is introduced to strengthen the supervision of pixels adjacent to boundaries. Meanwhile, we propose an Instance-wise Loss to deal with text instances of different scales. Extensive ablation studies validate the effectiveness of the proposed modules. Experiments on several benchmark datasets show that our method achieves better results than state-of-the-art methods.
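The abstract does not give the formula for the Instance-wise Loss, so the following is only a minimal sketch of one plausible reading of "instance-wise" weighting: binary cross-entropy is averaged inside each text instance first and then across instances, so a small instance counts as much as a large one. The function name `instance_balanced_bce`, the flat-list inputs, and the convention that instance id 0 means background are all assumptions made for illustration, not the authors' definition.

```python
import math
from collections import defaultdict


def instance_balanced_bce(probs, labels, instance_ids, eps=1e-7):
    """Average binary cross-entropy per instance, then across instances.

    Hypothetical illustration, not the paper's loss. The three inputs are
    flat, equal-length sequences: predicted text probabilities, 0/1 labels,
    and an integer instance id per pixel (0 is assumed to mean background).
    """
    if not (len(probs) == len(labels) == len(instance_ids)):
        raise ValueError("inputs must have the same length")

    sums = defaultdict(float)   # summed pixel loss per instance id
    counts = defaultdict(int)   # pixel count per instance id
    for p, y, inst in zip(probs, labels, instance_ids):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        sums[inst] += loss
        counts[inst] += 1

    if not counts:
        return 0.0

    # Each instance contributes its mean loss, regardless of its pixel count.
    per_instance = [sums[i] / counts[i] for i in counts]
    return sum(per_instance) / len(per_instance)
```

With a well-predicted 8-pixel instance and a poorly predicted 2-pixel instance, this value is higher than the plain per-pixel mean, because the small instance is no longer outvoted by the large one.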
Pages: 22