Supervised Attention Network for Arbitrary-Shaped Text Detection in Edge-Fainted Noisy Scene Images

Cited by: 3
Authors
Soni, Aishwarya [1 ]
Dutta, Tanima [1 ]
Nigam, Nitika [1 ]
Verma, Deepali [1 ]
Gupta, Hari Prabhat [1 ]
Affiliations
[1] IIT BHU, Dept Comp Sci & Engn, Varanasi 221005, Uttar Pradesh, India
Keywords
Image edge detection; Semantics; Feature extraction; Noise measurement; Detectors; Task analysis; Lighting; Deep neural networks; fainted edges; multiattention networks; scene text detection;
DOI
10.1109/TCSS.2022.3153557
CLC Number
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812 ;
Abstract
Text mining in noisy situations, such as poor contrast and fainted edges, is one of the challenging areas of research in the domain of social networks and computer vision. Scene text detection is a complicated task because text regions vary widely in size, orientation, aspect ratio, color, font, and script. Furthermore, the contrast of a scene image degrades drastically in noisy situations due to poor illumination and image filtering, which fades the text edges and makes detection even more challenging. In this article, we propose a semantic edge supervised spatial-channel attention network, called SESANet, for detecting arbitrary-shaped text instances in noisy scene images with faint text edges. Our network learns multiscale (MS) supervised edge semantics, pixel-wise spatial structure information, and interchannel dependencies to precisely localize text masks in scene images with poor contrast and illumination. SESANet is efficient, precise, and fast, and captures rich, dense, discriminative, and MS semantic information. The experimental results demonstrate the success of the proposed network, which achieves superior recall on publicly available benchmark datasets. We also create a new dataset of scene images, named the Edge-fainted Noisy Arbitrary-shaped Scene Text (EFNAST) dataset, with varying noise density, poor contrast, low illumination, and faint edges.
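The abstract describes spatial and channel attention over feature maps. The paper's exact architecture is not given in this record, so the sketch below is only a generic illustration of the two mechanisms (a squeeze-and-excitation-style channel gate followed by a pixel-wise spatial gate) using NumPy with random stand-in weights; the function names, reduction ratio, and gating formulas are assumptions, not SESANet's actual design.

```python
import numpy as np

def channel_attention(feat, reduction=4, seed=0):
    """Illustrative channel attention on a (C, H, W) feature map.
    Random projections stand in for learned excitation weights."""
    c = feat.shape[0]
    z = feat.mean(axis=(1, 2))                      # squeeze: global average pool -> (C,)
    rng = np.random.default_rng(seed)
    w1 = rng.standard_normal((c // reduction, c)) * 0.1
    w2 = rng.standard_normal((c, c // reduction)) * 0.1
    gate = 1.0 / (1.0 + np.exp(-(w2 @ np.maximum(w1 @ z, 0.0))))  # sigmoid gate, (C,)
    return feat * gate[:, None, None]               # reweight channels

def spatial_attention(feat):
    """Illustrative spatial attention: gate each pixel by pooled
    channel statistics (mean and max over channels)."""
    avg = feat.mean(axis=0)                         # (H, W)
    mx = feat.max(axis=0)                           # (H, W)
    gate = 1.0 / (1.0 + np.exp(-(avg + mx)))        # sigmoid gate in (0, 1)
    return feat * gate[None, :, :]                  # reweight pixels

feat = np.random.default_rng(1).standard_normal((8, 16, 16))
out = spatial_attention(channel_attention(feat))
print(out.shape)  # (8, 16, 16): attention preserves the feature-map shape
```

In a real detector both gates would be learned end-to-end and applied at multiple scales; here the gates only demonstrate that channel attention rescales whole feature channels while spatial attention rescales individual pixel locations.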
Pages: 1179-1188 (10 pages)