HGR-Net: Hierarchical Graph Reasoning Network for Arbitrary Shape Scene Text Detection

Cited by: 9

Authors:

Bi, Hengyue [1]

Xu, Canhui [1]

Shi, Cao [1]

Liu, Guozhu [1]

Zhang, Honghong [1]

Li, Yuteng [1]

Dong, Junyu [2]

Affiliations:

[1] Qingdao Univ Sci & Technol, Sch Informat Sci & Technol, Qingdao 266100, Peoples R China

[2] Ocean Univ China, Fac Informat Sci & Engn, Qingdao 266100, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2023 / Vol. 32

Funding:

National Natural Science Foundation of China;

Keywords:

Scene text detection; arbitrary shape text; hierarchical relation modeling; graph convolutional network;

DOI:

10.1109/TIP.2023.3294822

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

As a prerequisite step of scene text reading, scene text detection is known as a challenging task due to natural scene text diversity and variability. Most existing methods either adopt bottom-up sub-text component extraction or focus on top-down text contour regression. From a hybrid perspective, we explore hierarchical text instance-level and component-level representation for arbitrarily-shaped scene text detection. In this work, we propose a novel Hierarchical Graph Reasoning Network (HGR-Net), which consists of a Text Feature Extraction Network (TFEN) and a Text Relation Learner Network (TRLN). TFEN adaptively learns multi-grained text candidates based on shared convolutional feature maps, including instance-level text contours and component-level quadrangles. In TRLN, an inter-text graph is constructed to explore global contextual information with position-awareness between text instances, and an intra-text graph is designed to estimate geometric attributes for establishing component-level linkages. Next, we bridge the cross-feed interaction between instance-level and component-level, and it further achieves hierarchical relational reasoning by learning complementary graph embeddings across levels. Experiments conducted on three publicly available benchmarks SCUT-CTW1500, Total-Text, and ICDAR15 have demonstrated that HGR-Net achieves state-of-the-art performance on arbitrary orientation and arbitrary shape scene text detection.


Pages: 4142-4155

Page count: 14

References: 78 in total

