RMFPN: End-to-End Scene Text Recognition Using Multi-Feature Pyramid Network

Cited by: 2
Authors
Mahadshetti, Ruturaj [1 ]
Lee, Guee-Sang [1 ]
Choi, Deok-Jai [1 ]
Affiliations
[1] Chonnam Natl Univ, Dept Artificial Intelligence Convergence, Gwangju 61186, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Text recognition; Semantics; Feature extraction; Visualization; Task analysis; Linguistics; Image recognition; Deep learning; Convolutional neural networks; Scene text recognition; deep learning; convolutional neural network; transformer; multi-feature pyramid network;
DOI
10.1109/ACCESS.2023.3280547
CLC Number
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Scene text recognition (STR) plays an important role in many computer vision applications. STR has been an active research topic in the computer vision community, and deep learning-based STR methods have achieved impressive results over the past few years. However, even state-of-the-art scene text recognition approaches produce a notable number of incorrect predictions on images captured in real-world environments. Because such images lose precise text content, previous methods extract less robust features and weaker semantic information about the text. To address this issue, we propose a new approach called the Residual Multi-Feature Pyramid Network (RMFPN), which integrates ResNet and a Multi-Feature Pyramid Network to capture multi-level relations and improve the capacity and generalization of the feature extractor. We build RMFPN with two convolutional pyramids as the feature extractor, which strengthens the features and semantic information so that text can be recognized at various scales. Comprehensive experiments on diverse datasets demonstrate that the proposed method achieves strong recognition accuracy. RMFPN yields improvements of 0.61%, 1.2%, 1%, and 0.2% on the SVT, IC15, SVTP, and CUTE datasets, respectively.
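The abstract describes the feature extractor only at a high level, so the sketch below illustrates one plausible reading of it in PyTorch: a residual backbone producing multi-scale feature maps that are then fused by two stacked feature-pyramid passes. Every layer size, module name, and the exact fusion order here is an assumption made for illustration, not the published RMFPN configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Plain two-convolution residual block (channel-preserving)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        return F.relu(x + self.conv2(F.relu(self.conv1(x))))

class PyramidFusion(nn.Module):
    """One top-down pyramid pass: upsample the coarser map and add it to the finer one."""
    def __init__(self, channels):
        super().__init__()
        self.smooth = nn.ModuleList(nn.Conv2d(channels, channels, 3, padding=1) for _ in range(2))

    def forward(self, feats):  # feats ordered fine -> coarse
        c2, c3, c4 = feats
        p4 = c4
        p3 = self.smooth[0](c3 + F.interpolate(p4, size=c3.shape[-2:], mode="nearest"))
        p2 = self.smooth[1](c2 + F.interpolate(p3, size=c2.shape[-2:], mode="nearest"))
        return [p2, p3, p4]

class RMFPNSketch(nn.Module):
    """Hypothetical backbone: residual stages followed by two stacked pyramid passes."""
    def __init__(self, channels=64):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 3, stride=2, padding=1)
        # Three residual stages, each halving the spatial resolution.
        self.stages = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, stride=2, padding=1),
                ResidualBlock(channels),
            )
            for _ in range(3)
        )
        # "Two convolutional pyramids" is read here as two stacked fusion passes.
        self.pyramid1 = PyramidFusion(channels)
        self.pyramid2 = PyramidFusion(channels)

    def forward(self, x):
        x = F.relu(self.stem(x))
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return self.pyramid2(self.pyramid1(feats))

# Usage: a 32x128 text-line crop yields three fused multi-scale feature maps.
feats = RMFPNSketch()(torch.randn(1, 3, 32, 128))
print([tuple(f.shape) for f in feats])  # (1, 64, 8, 32), (1, 64, 4, 16), (1, 64, 2, 8)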
Pages: 61892-61900
Number of pages: 9