Scene Text Script Identification with Convolutional Recurrent Neural Networks

Cited by: 0
Authors
Mei, Jieru [1]
Dai, Luo [2]
Shi, Baoguang [2]
Bai, Xiang [2]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Automat, Wuhan 430074, Hubei, Peoples R China
[2] Huazhong Univ Sci & Technol, Sch Elect Informat & Commun, Wuhan 430074, Hubei, Peoples R China
Source
2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) | 2016
Funding
National Natural Science Foundation of China;
Keywords
FEATURES;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Script identification for scene text images is a challenging task. This paper describes a novel deep neural network structure that efficiently identifies the script of text in images. Our design exploits two important factors: the image representation and the spatial dependencies within text lines. To this end, we combine a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) into one end-to-end trainable network. The former generates rich image representations, while the latter analyzes long-term spatial dependencies. In addition, we add an average pooling structure on top of the network so that it can handle input images of arbitrary size. Experiments on several datasets, including SIW-13 and CVSI2015, demonstrate that our approach outperforms previous approaches.
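The pipeline described in the abstract (convolutional features, a recurrent pass along the text line, then average pooling to a fixed-size vector) can be illustrated with a minimal NumPy sketch. This is not the authors' network: the weights are random, the recurrent part is a plain tanh RNN, and every name, shape, and default here (`init_params`, `conv_features`, `rnn_states`, `classify`, 13 output classes, image height 8) is a hypothetical choice made for illustration. It only shows why averaging over horizontal positions lets one set of weights handle images of different widths.

```python
import numpy as np


def init_params(height=8, kernel_width=3, channels=4, hidden=5,
                num_scripts=13, seed=0):
    """Random toy weights; all sizes are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    return {
        "kernels": rng.normal(size=(channels, height, kernel_width)),
        "w_in": rng.normal(size=(hidden, channels)) * 0.5,
        "w_rec": rng.normal(size=(hidden, hidden)) * 0.5,
        "w_out": rng.normal(size=(num_scripts, hidden)),
    }


def conv_features(image, kernels):
    """Slide full-height kernels along the width.

    Returns one ReLU feature vector per horizontal position, so the
    sequence length depends on the image width.
    """
    kernel_width = kernels.shape[2]
    steps = image.shape[1] - kernel_width + 1
    feats = np.empty((steps, kernels.shape[0]))
    for t in range(steps):
        patch = image[:, t:t + kernel_width]
        feats[t] = np.maximum(0.0, (kernels * patch).sum(axis=(1, 2)))
    return feats


def rnn_states(feats, w_in, w_rec):
    """Plain tanh RNN over the feature sequence (a stand-in recurrent layer)."""
    h = np.zeros(w_rec.shape[0])
    states = []
    for x in feats:
        h = np.tanh(w_in @ x + w_rec @ h)
        states.append(h)
    return np.stack(states)


def classify(image, params):
    """Return a probability vector over script classes for one image."""
    feats = conv_features(image, params["kernels"])
    states = rnn_states(feats, params["w_in"], params["w_rec"])
    pooled = states.mean(axis=0)  # average pooling: fixed size for any width
    logits = params["w_out"] @ pooled
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()


params = init_params()
narrow = classify(np.ones((8, 10)), params)  # 10 pixels wide
wide = classify(np.ones((8, 37)), params)    # 37 pixels wide
print(narrow.shape, wide.shape)
```

Both calls return a vector with one probability per class even though the two inputs yield sequences of different lengths, because the mean over positions removes the width dimension before the final classifier.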
Pages: 4053-4058
Page count: 6