Kernel Proposal Network for Arbitrary Shape Text Detection

Cited by: 24
Authors
Zhang, Shi-Xue [1 ]
Zhu, Xiaobin [1 ]
Hou, Jie-Bo [1 ]
Yang, Chun [1 ]
Yin, Xu-Cheng [1 ,2 ,3 ]
Affiliations
[1] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing 100083, Peoples R China
[2] Univ Sci & Technol Beijing, Inst Artificial Intelligence, Beijing 100083, Peoples R China
[3] USTB EEasyTech, Joint Lab Artificial Intelligence, Beijing 100083, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Kernel; Proposals; Shape; Feature extraction; Convolution; Adaptation models; Image segmentation; Arbitrary shape text detection; deep neural network; dynamic convolution kernel; kernel proposal;
DOI
10.1109/TNNLS.2022.3152596
CLC (Chinese Library Classification) number
TP18 [Theory of Artificial Intelligence];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Segmentation-based methods have achieved great success for arbitrary shape text detection. However, separating neighboring text instances remains one of the most challenging problems due to the complexity of texts in scene images. In this article, we propose an innovative kernel proposal network (dubbed KPN) for arbitrary shape text detection. The proposed KPN separates neighboring text instances by classifying different texts into instance-independent feature maps, while avoiding the complex aggregation process found in segmentation-based arbitrary shape text detection methods. To be concrete, our KPN predicts a Gaussian center map for each text image, which is used to extract a series of candidate kernel proposals (i.e., dynamic convolution kernels) from the embedding feature maps according to their corresponding keypoint positions. To enforce independence between kernel proposals, we propose a novel orthogonal learning loss (OLL) via orthogonal constraints. Specifically, our kernel proposals contain important self-information learned by the network and location information from position embedding. Finally, the kernel proposals individually convolve all embedding feature maps to generate individual embedded maps of text instances. In this way, our KPN can effectively separate neighboring text instances and improve robustness against unclear boundaries. To the best of our knowledge, our work is the first to introduce the dynamic convolution kernel strategy to efficiently and effectively tackle the adhesion problem of neighboring text instances in text detection. Experimental results on challenging datasets verify the impressive performance and efficiency of our method. The code and model are available at https://github.com/GXYM/KPN.
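The two mechanisms the abstract describes can be sketched in a few lines: a kernel proposal is the embedding vector sampled at a keypoint, it acts as a 1x1 dynamic convolution over the embedding maps, and the orthogonal learning loss penalizes pairwise cosine similarity between proposals. A minimal, dependency-free sketch follows; the function names and toy shapes are illustrative assumptions, not the paper's actual implementation.

```python
import math

def sample_kernel(emb, y, x):
    """Extract a kernel proposal: the C-dim embedding vector at keypoint (y, x)."""
    return [channel[y][x] for channel in emb]

def dynamic_conv(emb, kernel):
    """1x1 dynamic convolution: dot the kernel proposal with the embedding at
    every pixel, producing one instance-specific response map."""
    C, H, W = len(emb), len(emb[0]), len(emb[0][0])
    return [[sum(kernel[c] * emb[c][i][j] for c in range(C)) for j in range(W)]
            for i in range(H)]

def orthogonal_loss(kernels):
    """Mean squared cosine similarity over all kernel pairs; minimizing it
    pushes different text instances toward orthogonal (independent) kernels."""
    loss, pairs = 0.0, 0
    for i in range(len(kernels)):
        for j in range(i + 1, len(kernels)):
            ki, kj = kernels[i], kernels[j]
            dot = sum(a * b for a, b in zip(ki, kj))
            ni = math.sqrt(sum(a * a for a in ki))
            nj = math.sqrt(sum(b * b for b in kj))
            loss += (dot / (ni * nj)) ** 2
            pairs += 1
    return loss / max(pairs, 1)

# Toy example: C=2 embedding on a 2x2 image, two keypoints whose embeddings
# are already orthogonal, so the loss is zero and each response map
# highlights only its own instance.
emb = [[[1.0, 0.0], [0.0, 0.0]],   # channel 0
       [[0.0, 1.0], [0.0, 0.0]]]   # channel 1
k1 = sample_kernel(emb, 0, 0)      # -> [1.0, 0.0]
k2 = sample_kernel(emb, 0, 1)      # -> [0.0, 1.0]
print(orthogonal_loss([k1, k2]))   # -> 0.0
print(dynamic_conv(emb, k1))       # -> [[1.0, 0.0], [0.0, 0.0]]
```

Because each proposal convolves the shared embedding maps independently, no pixel-aggregation or grouping post-processing is needed to separate adjacent instances, which is the efficiency argument the abstract makes.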
Pages: 8731-8742
Page count: 12