Soft set-based MSER end-to-end system for occluded scene text detection, recognition and prediction

Cited by: 0
Authors
Das, Alloy [1]
Palaiahnakote, Shivakumara [2]
Banerjee, Ayan [1]
Antonacopoulos, Apostolos [2]
Pal, Umapada [1]
Affiliations
[1] Indian Stat Inst, Comp Vis & Pattern Recognit Unit, Kolkata, India
[2] Univ Salford, Pattern Recognit & Image Anal PRImA Res Lab, Manchester, England
Keywords
Scene text detection; Scene text recognition; Scene text correction; Occluded scene text; Graph neural network; Convolutional recurrent neural network; Convolutional neural network
DOI
10.1016/j.knosys.2024.112593
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Unpredictable occlusions of natural scene text pose a significant challenge, compounding the difficulties that the variability of such images already creates for text detection and recognition. To meet the need for a robust, consistently performing approach to these challenges, this paper presents a new Soft Set-based end-to-end system for text detection, recognition and prediction in occluded natural scene images. This is the first approach to integrate text detection, recognition and prediction, unlike existing systems developed for end-to-end text spotting (text detection and recognition) only. For the detection of candidate text components, the proposed combination of Soft Sets with Maximally Stable Extremal Regions (SSMSER) improves text detection and spotting in natural scene images, irrespective of the presence of arbitrarily oriented and shaped text, complex backgrounds and occlusion. Furthermore, a Graph Recurrent Neural Network is proposed for grouping candidate text components into text lines and for fitting accurate bounding boxes to each word. Finally, a Convolutional Recurrent Neural Network (CRNN) is proposed for recognizing text and for predicting characters that are missing due to occlusion. Experimental results on a new occluded scene text dataset (OSTD) and on the most relevant benchmark natural scene text datasets demonstrate that the proposed system outperforms the state of the art in text detection, recognition and prediction. The code and dataset are available at https://github.com/alloydas/Softset-MSER-Based-Occluded-Scene-Text-Spotting/blob/master/Soft_set_MSER.ipynb
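For readers unfamiliar with the term, a soft set over a universe U is a mapping from a set of parameters to subsets of U. The sketch below is only a toy illustration of that general definition applied to candidate regions. It is not the SSMSER method of the paper: the region ids, the parameter names and the helper `soft_and` are all hypothetical, and the abstract does not say how the authors combine soft sets with MSER.

```python
from typing import Dict, List, Set

# A soft set (F, A) over a universe U maps each parameter in A to a subset of U.
SoftSet = Dict[str, Set[int]]


def soft_and(soft_set: SoftSet, params: List[str]) -> Set[int]:
    """Return the elements that belong to F(p) for every listed parameter p."""
    subsets = [soft_set[p] for p in params]
    return set.intersection(*subsets) if subsets else set()


# Hypothetical universe: ids of candidate regions, e.g. from an MSER detector.
candidate_ids = {0, 1, 2, 3, 4}

# Hypothetical parameters and memberships, chosen only for illustration.
regions: SoftSet = {
    "stable_area": {0, 1, 3, 4},
    "text_like_aspect_ratio": {1, 2, 3},
    "high_contrast": {1, 3, 4},
}

# Every subset must stay inside the universe of candidates.
assert all(members <= candidate_ids for members in regions.values())

# Keep only the regions that satisfy all three parameters.
text_candidates = soft_and(
    regions, ["stable_area", "text_like_aspect_ratio", "high_contrast"]
)
print(sorted(text_candidates))  # prints [1, 3]
```

Here the regions with ids 1 and 3 are the only ones that appear under all three parameters, so they are the only ones kept as text candidates.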
Pages: 19