Scene Text Script Identification with Convolutional Recurrent Neural Networks

Cited by: 0

Authors:

Mei, Jieru [1]

Dai, Luo [2]

Shi, Baoguang [2]

Bai, Xiang [2]

Affiliations:

[1] Huazhong Univ Sci & Technol, Sch Automat, Wuhan 430074, Hubei, Peoples R China

[2] Huazhong Univ Sci & Technol, Sch Elect Informat & Commun, Wuhan 430074, Hubei, Peoples R China

Source:

2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) | 2016

Funding:

National Natural Science Foundation of China;

Keywords:

FEATURES;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Script identification for scene text images is a challenging task. This paper describes a novel deep neural network structure that efficiently identifies the scripts of text in images. In our design, we exploit two important factors, namely the image representation and the spatial dependencies within text lines. To this end, we bring together a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) into one end-to-end trainable network. The former generates rich image representations, while the latter effectively analyzes long-term spatial dependencies. In addition, on top of this structure, we adopt average pooling in order to handle input images of arbitrary size. Experiments on several datasets, including SIW-13 and CVSI2015, demonstrate that our approach achieves superior performance compared with previous approaches.
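
The role of average pooling in handling arbitrary image sizes can be illustrated with a minimal sketch. It assumes (the abstract does not say this) that the recurrent layers emit one score vector per horizontal frame and that pooling averages those vectors over the width axis. The function names, the three script labels, and the score values are hypothetical, and the CNN and RNN stages are not modeled here.

```python
from typing import List, Sequence


def average_pool_over_width(frame_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame score vectors over the width axis.

    The number of frames grows with image width, but the output length
    depends only on the number of script classes.
    """
    if not frame_scores:
        raise ValueError("need at least one frame")
    num_classes = len(frame_scores[0])
    if any(len(frame) != num_classes for frame in frame_scores):
        raise ValueError("all frames must have the same number of classes")
    num_frames = len(frame_scores)
    return [
        sum(frame[c] for frame in frame_scores) / num_frames
        for c in range(num_classes)
    ]


def predict_script(frame_scores: Sequence[Sequence[float]],
                   labels: Sequence[str]) -> str:
    """Pick the label with the highest pooled score."""
    pooled = average_pool_over_width(frame_scores)
    best = max(range(len(pooled)), key=pooled.__getitem__)
    return labels[best]


# Hypothetical per-frame scores for two images of different widths.
labels = ["Latin", "Arabic", "Chinese"]
narrow = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]            # 2 frames
wide = [[0.1, 0.8, 0.1], [0.2, 0.6, 0.2],
        [0.3, 0.5, 0.2], [0.2, 0.7, 0.1]]              # 4 frames

print(predict_script(narrow, labels))
print(predict_script(wide, labels))
```

Both inputs reduce to a vector of the same length even though they contain different numbers of frames, which is the property that lets a fixed-size classifier sit on top of a variable-width sequence.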


Pages: 4053-4058

Page count: 6
