Multi-Spectral Fusion Based Approach for Arbitrarily Oriented Scene Text Detection in Video Images

Cited by: 42
Authors
Liang, Guozhu [1]
Shivakumara, Palaiahnakote [2]
Lu, Tong [1]
Tan, Chew Lim [3]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Jiangsu, Peoples R China
[2] Univ Malaya, Fac Comp Sci & Informat Technol, Kuala Lumpur 50603, Malaysia
[3] Natl Univ Singapore, Sch Comp, Singapore 119077, Singapore
Funding
National Science Foundation (United States);
Keywords
Laplacian-wavelet; multi spectral fusion; maxima stable extreme regions; stroke width transform; arbitrarily oriented video text detection; EXTRACTION; GRADIENT; RECOGNITION; SCHEME;
DOI
10.1109/TIP.2015.2465169
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Scene text detection from video as well as natural scene images is challenging due to the variations in background, contrast, text type, font type, font size, and so on. Besides, arbitrary orientations of texts with multi-scripts add more complexity to the problem. The proposed approach introduces a new idea of convolving Laplacian with wavelet sub-bands at different levels in the frequency domain for enhancing low resolution text pixels. Then, the results obtained from different sub-bands (spectral) are fused for detecting candidate text pixels. We explore maxima stable extreme regions along with stroke width transform for detecting candidate text regions. Text alignment is done based on the distance between the nearest neighbor clusters of candidate text regions. In addition, the approach presents a new symmetry driven nearest neighbor for restoring full text lines. We conduct experiments on our collected video data as well as several benchmark data sets, such as ICDAR 2011, ICDAR 2013, and MSRA-TD500 to evaluate the proposed method. The proposed approach is compared with the state-of-the-art methods to show its superiority to the existing methods.
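The enhancement-and-fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract states only that a Laplacian is convolved with wavelet sub-bands in the frequency domain and that the sub-band results are fused. The one-level Haar decomposition, the 3x3 Laplacian kernel, the pixel-wise maximum used as the fusion rule, the mean-plus-k-standard-deviations threshold, and all function names below are assumptions chosen for illustration.

```python
import numpy as np


def haar_detail_subbands(img):
    """One-level 2-D Haar decomposition (assumed wavelet); returns the three detail sub-bands."""
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    horizontal = (a + b - c - d) / 2.0  # responds to horizontal edges
    vertical = (a - b + c - d) / 2.0    # responds to vertical edges
    diagonal = (a - b - c + d) / 2.0    # responds to diagonal structure
    return horizontal, vertical, diagonal


def laplacian_in_frequency_domain(band):
    """Circular convolution of a sub-band with a 3x3 Laplacian, computed via the FFT."""
    kernel = np.array([[0.0, 1.0, 0.0],
                       [1.0, -4.0, 1.0],
                       [0.0, 1.0, 0.0]])
    padded = np.zeros(band.shape)
    padded[:3, :3] = kernel
    # Shift the kernel so that its centre sits at index (0, 0).
    padded = np.roll(padded, (-1, -1), axis=(0, 1))
    # Multiplying the spectra is equivalent to circular convolution in space.
    return np.real(np.fft.ifft2(np.fft.fft2(band) * np.fft.fft2(padded)))


def fuse_candidate_text_pixels(img, k=1.0):
    """Fuse the enhanced sub-bands and threshold them (both rules are assumptions)."""
    img = np.asarray(img, dtype=float)
    # Crop to even dimensions so the image splits cleanly into 2x2 blocks.
    img = img[: img.shape[0] // 2 * 2, : img.shape[1] // 2 * 2]
    responses = [np.abs(laplacian_in_frequency_domain(band))
                 for band in haar_detail_subbands(img)]
    fused = np.maximum.reduce(responses)       # pixel-wise maximum across sub-bands
    threshold = fused.mean() + k * fused.std()
    return fused, fused > threshold


# Usage on a synthetic frame: a bright bar on a dark background.
frame = np.zeros((32, 32))
frame[9:17, 9:24] = 1.0
fused, mask = fuse_candidate_text_pixels(frame)
```

The returned mask is at half the input resolution because each Haar sub-band is subsampled by two. Repeating the decomposition on the approximation band would give the "different levels" the abstract mentions. Candidate pixels from such a mask would still need the region-level steps the abstract lists (region extraction, stroke width analysis, and nearest-neighbor grouping), which this sketch does not attempt.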
Pages: 4488-4501
Number of pages: 14
Related papers
50 in total
  • [1] Gradient Vector Flow and Grouping-based Method for Arbitrarily Oriented Scene Text Detection in Video Images
    Shivakumara, Palaiahnakote
    Trung Quy Phan
    Lu, Shijian
    Tan, Chew Lim
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2013, 23 (10) : 1729 - 1739
  • [2] Multi-oriented text detection and verification in video frames and scene images
    Sain, Aneeshan
    Bhunia, Ayan Kumar
    Roy, Partha Pratim
    Pal, Umapada
    NEUROCOMPUTING, 2018, 275 : 1531 - 1549
  • [3] Multi-Oriented Text Detection in Scene Images
    Basavanna, M.
    Shivakumara, P.
    Srivatsa, S. K.
    Kumar, G. Hemantha
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2012, 26 (07)
  • [4] Arbitrarily-Oriented Text Detection in Low Light Natural Scene Images
    Xue, Minglong
    Shivakumara, Palaiahnakote
    Zhang, Chao
    Xiao, Yao
    Lu, Tong
    Pal, Umapada
    Lopresti, Daniel
    Yang, Zhibo
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 2706 - 2720
  • [5] Arbitrarily-oriented multi-lingual text detection in video
    Khare, Vijeta
    Shivakumara, Palaiahnakote
    Paramesran, Raveendran
    Blumenstein, Michael
    MULTIMEDIA TOOLS AND APPLICATIONS, 2017, 76 (15) : 16625 - 16655
  • [6] Arbitrarily-oriented multi-lingual text detection in video
    Vijeta Khare
    Palaiahnakote Shivakumara
    Raveendran Paramesran
    Michael Blumenstein
    Multimedia Tools and Applications, 2017, 76 : 16625 - 16655
  • [7] Multi-Script-Oriented Text Detection and Recognition in Video/Scene/Born Digital Images
    Raghunandan, K. S.
    Shivakumara, Palaiahnakote
    Roy, Sangheeta
    Kumar, G. Hemantha
    Pal, Umapada
    Lu, Tong
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2019, 29 (04) : 1145 - 1162
  • [8] Region-based Image Fusion Approach of Panchromatic and Multi-spectral Images
    Gharbia, Reham
    El Baz, Ali Hassan
    Hassanien, Aboul Ella
    Snasel, Vaclav
    INTELLIGENT DATA ANALYSIS AND APPLICATIONS, 2015, 370 : 535 - 545
  • [9] Spectral unmixing based fusion algorithm for hyperspectral and multi-spectral images
    Zhao, Chunhui
    Zhang, Hongyu
    PROCEEDINGS OF 2016 IEEE 13TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP 2016), 2016, : 772 - 776
  • [10] A New Multi-spectral Fusion Method for Degraded Video Text Frame Enhancement
    Weng, Yangbing
    Shivakumara, Palaiahnakote
    Lu, Tong
    Meng, Liang Kim
    Woon, Hon Hock
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2015, PT I, 2015, 9314 : 495 - 506