HGR-Net: Hierarchical Graph Reasoning Network for Arbitrary Shape Scene Text Detection

Cited by: 9

Authors:

Bi, Hengyue [1]

Xu, Canhui [1]

Shi, Cao [1]

Liu, Guozhu [1]

Zhang, Honghong [1]

Li, Yuteng [1]

Dong, Junyu [2]

Affiliations:

[1] Qingdao Univ Sci & Technol, Sch Informat Sci & Technol, Qingdao 266100, Peoples R China

[2] Ocean Univ China, Fac Informat Sci & Engn, Qingdao 266100, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2023 / Vol. 32

Funding:

National Natural Science Foundation of China;

Keywords:

Scene text detection; arbitrary shape text; hierarchical relation modeling; graph convolutional network;

DOI:

10.1109/TIP.2023.3294822

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

As a prerequisite step of scene text reading, scene text detection is known as a challenging task due to natural scene text diversity and variability. Most existing methods either adopt bottom-up sub-text component extraction or focus on top-down text contour regression. From a hybrid perspective, we explore hierarchical text instance-level and component-level representation for arbitrarily-shaped scene text detection. In this work, we propose a novel Hierarchical Graph Reasoning Network (HGR-Net), which consists of a Text Feature Extraction Network (TFEN) and a Text Relation Learner Network (TRLN). TFEN adaptively learns multi-grained text candidates based on shared convolutional feature maps, including instance-level text contours and component-level quadrangles. In TRLN, an inter-text graph is constructed to explore global contextual information with position-awareness between text instances, and an intra-text graph is designed to estimate geometric attributes for establishing component-level linkages. Next, we bridge the cross-feed interaction between instance-level and component-level, and it further achieves hierarchical relational reasoning by learning complementary graph embeddings across levels. Experiments conducted on three publicly available benchmarks SCUT-CTW1500, Total-Text, and ICDAR15 have demonstrated that HGR-Net achieves state-of-the-art performance on arbitrary orientation and arbitrary shape scene text detection.
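The abstract mentions building a graph over component-level text candidates and reasoning over it with a graph convolutional network. As a rough illustration of that general idea only, the sketch below links hypothetical component centers by nearest-neighbour distance and applies one standard graph convolution step. It is not the authors' TRLN: the function names `knn_adjacency` and `gcn_layer`, the k-nearest-neighbour linking rule, the feature sizes, and the sample coordinates are all assumptions made for this example.

```python
import numpy as np

def knn_adjacency(centers, k):
    """Symmetric k-nearest-neighbour adjacency with self-loops (assumed linking rule)."""
    n = len(centers)
    # Pairwise Euclidean distances between component centers.
    dists = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    adj = np.eye(n)
    for i in range(n):
        # Position 0 of the sorted order is the node itself (distance 0), so skip it.
        neighbours = np.argsort(dists[i])[1:k + 1]
        adj[i, neighbours] = 1.0
        adj[neighbours, i] = 1.0
    return adj

def gcn_layer(features, adj, weight):
    """One graph convolution step: ReLU(D^-1/2 A D^-1/2 X W)."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    norm_adj = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ features @ weight, 0.0)

# Hypothetical data: five component centers forming two spatially separate groups.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [1.0, 0.1], [2.0, 0.3], [10.0, 5.0], [11.0, 5.2]])
features = rng.normal(size=(5, 4))   # made-up 4-dimensional node features
weight = rng.normal(size=(4, 3))     # made-up layer weights (untrained)

adj = knn_adjacency(centers, k=1)
embeddings = gcn_layer(features, adj, weight)
print(adj)
print(embeddings.shape)
```

With `k=1`, the three nearby components link to each other and the two distant ones link only to each other, so each group's features are aggregated separately. A trained model would learn `weight` and would likely use richer geometric attributes than center distance.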

Pages: 4142-4155

Page count: 14

