Deep-CNNTL: Text Localization from Natural Scene Images Using Deep Convolution Neural Network with Transfer Learning

Cited by: 0
Authors
Y. L. Chaitra
R. Dinesh
M. T. Gopalakrishna
B. V. Ajay Prakash
Affiliations
[1] Jain University, SJB Institute of Technology
[2] Affiliated to Visvesvaraya Technological University
Source
Arabian Journal for Science and Engineering | 2022 / Vol. 47
Keywords
Text localization; Deep learning; Transfer learning; Scene images; VGG16 architecture
DOI
Not available
Abstract
Text localization in natural scene images is an essential step in reading the text content present in an image. Localizing this content is difficult because text in natural scenes is scattered, and no prior information is available about its location, size, orientation, or the number of text instances in the image. These factors make text localization in natural scene images challenging. We propose a comprehensive solution for localizing text using a Deep Convolution Neural Network (DCNN) and Transfer Learning (TL). DCNN hyperparameters, such as the configuration of the convolution and dense layers, dropout, and learning rate, are optimized using random search. The combination of DCNN and TL, built on the VGG16 architecture, is more effective in processing complex text images. The proposed method was evaluated on the standard ICDAR 2015 dataset, and the results proved more effective in terms of accuracy than state-of-the-art methods, with an F-score of 0.8279.
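The random search mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the search space values, the names `SEARCH_SPACE`, `sample_config`, `random_search`, and `toy_evaluate`, and the stand-in objective are hypothetical and are not taken from the paper. In a real setting the evaluation function would train the VGG16-based network with the sampled settings and return a validation score.

```python
import random

# Hypothetical search space; the abstract does not list the ranges used.
SEARCH_SPACE = {
    "dense_units": [128, 256, 512, 1024],
    "dropout": [0.2, 0.3, 0.4, 0.5],
    "learning_rate": [1e-2, 1e-3, 1e-4, 1e-5],
}


def sample_config(rng):
    """Draw one value per hyperparameter, independently and uniformly."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def random_search(evaluate, n_trials, seed=0):
    """Return the best (score, config) pair found over n_trials random draws."""
    rng = random.Random(seed)
    best_score, best_config = float("-inf"), None
    for _ in range(n_trials):
        config = sample_config(rng)
        score = evaluate(config)
        if score > best_score:
            best_score, best_config = score, config
    return best_score, best_config


def toy_evaluate(config):
    """Stand-in objective (assumption): a real run would train the model
    with this configuration and return a validation F-score."""
    return -abs(config["dropout"] - 0.3) - abs(config["learning_rate"] - 1e-4)


if __name__ == "__main__":
    score, config = random_search(toy_evaluate, n_trials=50, seed=0)
    print(score, config)
```

Unlike a grid search, each trial samples every hyperparameter independently, so the number of trials can be chosen separately from the size of the search space.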
Pages: 9629-9640
Page count: 11