Region-Aware Arbitrary-Shaped Text Detection With Progressive Fusion

Cited by: 4

Authors:

Wang, Qitong ^{[1,2]}

Fu, Bin ^{[3]}

Li, Ming ^{[3]}

He, Junjun ^{[3]}

Peng, Xi ^{[2]}

Qiao, Yu ^{[3,4]}

Affiliations:

[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China

[2] Univ Delaware, Dept Comp & Informat Sci, Newark, DE 19716 USA

[3] Chinese Acad Sci, Shenzhen Inst Adv Technol, Guangdong Hong Kong Macao Joint Lab Human Machine, Shenzhen 518055, Peoples R China

[4] Shanghai AI Lab, Shanghai 200031, Peoples R China

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2023 / Vol. 25

Keywords:

Scene text detection; scene understanding; deep learning;

DOI:

10.1109/TMM.2022.3181448

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Segmentation-based text detectors can flexibly capture arbitrary-shaped text regions. Because text geometry varies widely, effective and robust representations are needed to identify text regions of various shapes and scales. In this paper, we focus on designing effective multi-scale contextual features for locating text instances. Specifically, we develop a Region Context Module (RCM) to summarize the semantic response and adaptively extract text-region-aware information in a limited local area. To construct complementary multi-scale contextual representations, multiple RCM branches with different scales are employed and integrated via a Progressive Fusion Module (PFM). The proposed RCM and PFM serve as plug-and-play modules that can be incorporated into existing scene text detection platforms to further boost detection performance. Extensive experiments show that our methods achieve state-of-the-art performance on the Total-Text, SCUT-CTW1500, and MSRA-TD500 datasets. The code with models will become publicly available at https://github.com/wqtwjt1996/RP-Text.


Pages: 4718-4729

Page count: 12

