Integrating Local CNN and Global CNN for Script Identification in Natural Scene Images

Cited by: 40
Authors
Lu, Liqiong [1]
Yi, Yaohua [1]
Huang, Faliang [2]
Wang, Kaili [1]
Wang, Qi [3]
Affiliations
[1] Wuhan Univ, Sch Printing & Packaging, Wuhan 430072, Hubei, Peoples R China
[2] Fujian Normal Univ, Coll Math & Informat, Fuzhou 350007, Fujian, Peoples R China
[3] Nanjing Forestry Univ, Sch Light Ind & Food Engn, Nanjing 210037, Jiangsu, Peoples R China
Keywords
Script identification; Local CNN; Global CNN; ResNet-20; decision-level fusion
DOI
10.1109/ACCESS.2019.2911964
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Script identification in natural scene images is a key preprocessing step for text recognition and a prerequisite for automatic text understanding systems designed for multilingual environments. In this paper, we present a novel script identification framework that integrates a Local CNN and a Global CNN, both based on ResNet-20. We first extract a large number of patches and segmented images from the input images according to their aspect ratios. These patches and segmented images are then used as training inputs to the Local CNN and the Global CNN, respectively. Finally, the AdaBoost algorithm combines the outputs of the Local CNN and the Global CNN through decision-level fusion to produce the final result. With this strategy, the Local CNN fully exploits the local features of the image and reveals subtle differences among scripts that are difficult to distinguish, such as English, Greek, and Russian, while the Global CNN mines the global features of the image to further improve identification accuracy. Experimental results show that our approach performs well on four public datasets.
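The pipeline described in the abstract (local patches, a global view, then decision-level fusion) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names `extract_patches` and `fuse_scores`, the square sliding window with a half-height stride, and the fixed fusion weights are hypothetical, and the fixed weighted average is a simple stand-in for the AdaBoost-based fusion the paper uses, not a reproduction of it.

```python
import numpy as np


def extract_patches(image, stride=None):
    """Cut square patches (side = image height) along the image width.

    Hypothetical patching rule: the abstract only states that patches
    depend on the aspect ratio, so this sliding window is an assumption.
    """
    h, w = image.shape[:2]
    if w <= h:
        # Narrow image: treat the whole image as a single patch.
        return [image]
    stride = stride or max(1, h // 2)
    starts = list(range(0, w - h + 1, stride))
    if starts[-1] != w - h:
        starts.append(w - h)  # make sure the right edge is covered
    return [image[:, s:s + h] for s in starts]


def fuse_scores(local_patch_probs, global_probs, w_local=0.5, w_global=0.5):
    """Decision-level fusion of per-class probabilities.

    Stand-in for AdaBoost: per-patch scores from the local model are
    averaged, then combined with the global model's scores using fixed,
    assumed weights.
    """
    local_probs = np.mean(np.asarray(local_patch_probs, dtype=float), axis=0)
    fused = w_local * local_probs + w_global * np.asarray(global_probs, dtype=float)
    return int(np.argmax(fused)), fused


# Usage with dummy data: a 32x100 grayscale "word image" and two classes.
word_image = np.zeros((32, 100))
patches = extract_patches(word_image)
label, fused = fuse_scores([[0.6, 0.4], [0.2, 0.8]], [0.7, 0.3])
```

In a real system the per-patch and global probabilities would come from the two trained networks, and the fusion weights would be learned rather than fixed.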
Pages: 52669-52679
Number of pages: 11
Related papers
30 in total
[1]   Script identification in natural scene image and video frames using an attention based Convolutional-LSTM network [J].
Bhunia, Ankan Kumar ;
Konwer, Aishik ;
Bhunia, Ayan Kumar ;
Bhowmick, Abir ;
Roy, Partha P. ;
Pal, Umapada .
PATTERN RECOGNITION, 2019, 85 :172-184
[2]   Word-Level Script Identification from Scene Images [J].
Fasil, O. K. ;
Manjunath, S. ;
Aradhya, V. N. Manjunath .
PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON FRONTIERS IN INTELLIGENT COMPUTING: THEORY AND APPLICATIONS, (FICTA 2016), VOL 2, 2017, 516 :417-426
[3]   Script Recognition-A Review [J].
Ghosh, Debashis ;
Dube, Tulika ;
Shivaprasad, Adamane P. .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2010, 32 (12) :2142-2161
[4]  
Gomez L., 2016, 12 IAPR WORKSH DOC A
[5]   Improving patch-based scene text script identification with ensembles of conjoined networks [J].
Gomez, Lluis ;
Nicolaou, Anguelos ;
Karatzas, Dimosthenis .
PATTERN RECOGNITION, 2017, 67 :85-96
[6]  
He K., 2016, IEEE C COMPUT VIS PA, DOI [10.1007/978-3-319-46493-0_38, 10.1109/CVPR.2016.90]
[7]   Overlapping Community Detection for Multimedia Social Networks [J].
Huang, Faliang ;
Li, Xuelong ;
Zhang, Shichao ;
Zhang, Jilian ;
Chen, Jinhui ;
Zhai, Zhinian .
IEEE TRANSACTIONS ON MULTIMEDIA, 2017, 19 (08) :1881-1893
[8]  
Jaderberg M., 2014, 13 EUR C COMP VIS ZU
[9]   Reading Text in the Wild with Convolutional Neural Networks [J].
Jaderberg, Max ;
Simonyan, Karen ;
Vedaldi, Andrea ;
Zisserman, Andrew .
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2016, 116 (01) :1-20
[10]  
Jia Y., 2014, Proceedings of the 22nd ACM international conference on Multimedia, DOI 10.1145/2647868.2654889