Region-Aware Arbitrary-Shaped Text Detection With Progressive Fusion

Cited by: 4

Authors:

Wang, Qitong [1,2]

Fu, Bin [3]

Li, Ming [3]

He, Junjun [3]

Peng, Xi [2]

Qiao, Yu [3,4]

Affiliations:

[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China

[2] Univ Delaware, Dept Comp & Informat Sci, Newark, DE 19716 USA

[3] Chinese Acad Sci, Shenzhen Inst Adv Technol, Guangdong Hong Kong Macao Joint Lab Human Machine, Shenzhen 518055, Peoples R China

[4] Shanghai AI Lab, Shanghai 200031, Peoples R China

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2023 / Vol. 25

Keywords:

Scene text detection; scene understanding; deep learning

DOI:

10.1109/TMM.2022.3181448

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Segmentation-based text detectors are flexible enough to capture arbitrary-shaped text regions. Because text geometry varies widely, effective and robust representations are needed to identify text regions of various shapes and scales. In this paper, we focus on designing effective multi-scale contextual features for locating text instances. Specifically, we develop a Region Context Module (RCM) to summarize the semantic response and adaptively extract text-region-aware information in a limited local area. To construct complementary multi-scale contextual representations, multiple RCM branches with different scales are employed and integrated via a Progressive Fusion Module (PFM). The proposed RCM and PFM serve as plug-and-play modules that can be incorporated into existing scene text detection platforms to further boost detection performance. Extensive experiments show that our methods achieve state-of-the-art performance on the Total-Text, SCUT-CTW1500 and MSRA-TD500 datasets. The code with models will become publicly available at https://github.com/wqtwjt1996/RP-Text.


Pages: 4718-4729

Page count: 12

