Region-Aware Arbitrary-Shaped Text Detection With Progressive Fusion

Cited by: 4
Authors
Wang, Qitong [1, 2]
Fu, Bin [3]
Li, Ming [3]
He, Junjun [3]
Peng, Xi [2]
Qiao, Yu [3, 4]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China
[2] Univ Delaware, Dept Comp & Informat Sci, Newark, DE 19716 USA
[3] Chinese Acad Sci, Shenzhen Inst Adv Technol, Guangdong Hong Kong Macao Joint Lab Human Machine, Shenzhen 518055, Peoples R China
[4] Shanghai AI Lab, Shanghai 200031, Peoples R China
Keywords
Scene text detection; scene understanding; deep learning
DOI
10.1109/TMM.2022.3181448
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Segmentation-based text detectors can flexibly capture arbitrary-shaped text regions. Because text geometry varies widely, effective and robust representations are needed to identify text regions of various shapes and scales. In this paper, we focus on designing effective multi-scale contextual features for locating text instances. Specifically, we develop a Region Context Module (RCM) to summarize the semantic response and adaptively extract text-region-aware information in a limited local area. To construct complementary multi-scale contextual representations, multiple RCM branches with different scales are employed and integrated via a Progressive Fusion Module (PFM). The proposed RCM and PFM serve as plug-and-play modules that can be incorporated into existing scene text detection platforms to further boost detection performance. Extensive experiments show that our methods achieve state-of-the-art performance on the Total-Text, SCUT-CTW1500, and MSRA-TD500 datasets. The code with models will become publicly available at https://github.com/wqtwjt1996/RP-Text.
Pages: 4718-4729
Page count: 12