End-to-end Scene Text Recognition in Videos Based on Multi Frame Tracking

Cited by: 17
Authors
Wang, Xiaobing [1]
Jiang, Yingying [1]
Yang, Shuli [1]
Zhu, Xiangyu [1]
Li, Wei [1]
Fu, Pei [1]
Wang, Hua [1]
Luo, Zhenbo [1]
Affiliation
[1] Samsung R&D Inst China, Machine Learning Lab, Beijing, Peoples R China
Keywords
end-to-end text recognition; text in videos; deep neural network; multi frame tracking;
DOI
10.1109/ICDAR.2017.207
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text detection and recognition in scene images and videos have recently attracted much attention in computer vision. However, most existing text detection and recognition methods focus only on static images. This paper proposes an end-to-end scene text recognition method for text in videos, based on multi-frame tracking, in which temporal information is used to improve performance. First, an end-to-end text recognition method based on a unified deep neural network detects and recognizes text in each frame of the input video. Then, multi-frame text tracking associates the text in the current frame with the text in several previous frames to obtain the final results. Experiments on ICDAR datasets demonstrate that the proposed method outperforms state-of-the-art methods in end-to-end video text recognition.
Pages: 1255 - 1260
Page count: 6
Related papers
50 in total
  • [1] End-to-End Scene Text Recognition
    Wang, Kai
    Babenko, Boris
    Belongie, Serge
    2011 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2011, : 1457 - 1464
  • [2] An End-to-End Scene Text Recognition for Bilingual Text
    Albalawi, Bayan M.
    Jamal, Amani T.
    Al Khuzayem, Lama A.
    Alsaedi, Olaa A.
    BIG DATA AND COGNITIVE COMPUTING, 2024, 8 (09)
  • [3] Transformer-based end-to-end scene text recognition
    Zhu, Xinghao
    Zhang, Zhi
    PROCEEDINGS OF THE 2021 IEEE 16TH CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA 2021), 2021, : 1691 - 1695
  • [4] An end-to-end model for multi-view scene text recognition
    Banerjee, Ayan
    Shivakumara, Palaiahnakote
    Bhattacharya, Saumik
    Pal, Umapada
    Liu, Cheng-Lin
    PATTERN RECOGNITION, 2024, 149
  • [5] Scene text spotting based on end-to-end
    Wei G.
    Rong W.
    Liang Y.
    Xiao X.
    Liu X.
    Journal of Intelligent and Fuzzy Systems, 2021, 40 (05) : 8871 - 8881
  • [6] End-to-End Scene Text Recognition with Character Centroid Prediction
    Zhao, Wei
    Ma, Jinwen
    NEURAL INFORMATION PROCESSING (ICONIP 2017), PT III, 2017, 10636 : 291 - 299
  • [7] RMFPN: End-to-End Scene Text Recognition Using Multi-Feature Pyramid Network
    Mahadshetti, Ruturaj
    Lee, Guee-Sang
    Choi, Deok-Jai
    IEEE ACCESS, 2023, 11 : 61892 - 61900
  • [8] EEM: An End-to-end Evaluation Metric for Scene Text Detection and Recognition
    Hao, Jiedong
    Wen, Yafei
    Deng, Jie
    Gan, Jun
    Ren, Shuai
    Tan, Hui
    Chen, Xiaoxin
    DOCUMENT ANALYSIS AND RECOGNITION, ICDAR 2021, PT IV, 2021, 12824 : 95 - 108
  • [9] Person Re-identification with End-to-End Scene Text Recognition
    Kamlesh
    Xu, Pei
    Yang, Yang
    Xu, Yongchao
    COMPUTER VISION, PT III, 2017, 773 : 363 - 374
  • [10] End-to-End Analysis for Text Detection and Recognition in Natural Scene Images
    Alnefaie, Ahlam
    Gupta, Deepak
    Bhuyan, Monowar H.
    Razzak, Imran
    Gupta, Prashant
    Prasad, Mukesh
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,