Learning Document Image Features With SqueezeNet Convolutional Neural Network

Cited by: 16
Authors
Hassanpour, M. [1]
Malek, H. [1]
Affiliations
[1] Shahid Beheshti Univ, Dept Comp Sci Engn, Tehran, Iran
Source
INTERNATIONAL JOURNAL OF ENGINEERING | 2020, Vol. 33, Issue 07
Keywords
SqueezeNet; Convolutional Neural Network; Document Image Classification
DOI
10.5829/ije.2020.33.07a.05
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
The classification of document images into their various classes is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered the current state-of-the-art models for this task. However, these classifiers have two major drawbacks: the huge computational power required for training, and their very large number of weights. Previous successful attempts at learning document image features have been based on training very large CNNs. SqueezeNet is a CNN architecture that achieves accuracy comparable to other state-of-the-art CNNs while containing up to 50 times fewer weights, but it has not previously been evaluated on document image classification tasks. In this research we take a novel approach to learning document image features by training a very small CNN such as SqueezeNet. We show that an ImageNet-pretrained SqueezeNet achieves an accuracy of approximately 75 percent over 10 classes on the Tobacco-3482 dataset, which is comparable to other state-of-the-art CNNs. We then visualize saliency maps of the gradient of our trained SqueezeNet's output with respect to its input, which show that the network is able to learn meaningful features that are useful for document classification. Previous works in this field have placed no emphasis on visualizing the learned document features. The importance of features such as the existence of handwritten text, document titles, text alignment and tabular structures in the extracted saliency maps indicates that the network does not overfit to redundant representations of the rather small Tobacco-3482 dataset, which contains only 3482 document images over 10 classes.
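The saliency maps mentioned in the abstract are gradients of a network output with respect to the input pixels. The following is a minimal sketch of that idea on a toy two-layer ReLU classifier in plain Python, not the authors' implementation: the weights, the 4-value "image", and the function names `forward` and `saliency` are all hypothetical, and a real experiment would presumably obtain the same gradient from a trained SqueezeNet through a deep learning framework's automatic differentiation.

```python
# Vanilla gradient saliency for a toy two-layer ReLU classifier.
# All weights, sizes, and names below are hypothetical stand-ins;
# the abstract applies the same idea to a trained SqueezeNet.

def forward(x, W1, b1, W2, b2):
    """Return hidden pre-activations z and class scores s."""
    z = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    h = [max(0.0, zj) for zj in z]  # ReLU
    s = [sum(w * hj for w, hj in zip(row, h)) + b for row, b in zip(W2, b2)]
    return z, s

def saliency(x, W1, b1, W2, b2, target=None):
    """Absolute gradient of the target class score w.r.t. each input value."""
    z, s = forward(x, W1, b1, W2, b2)
    if target is None:
        target = max(range(len(s)), key=s.__getitem__)  # predicted class
    # d s[target] / d h is the target row of W2.
    grad_h = W2[target]
    # ReLU passes the gradient only where the pre-activation was positive.
    grad_z = [g if zj > 0 else 0.0 for g, zj in zip(grad_h, z)]
    # Chain rule through the first linear layer.
    grad_x = [sum(gz * W1[j][i] for j, gz in enumerate(grad_z))
              for i in range(len(x))]
    return target, [abs(g) for g in grad_x]

# A 4-"pixel" input, 2 hidden units, 2 classes (illustrative values only).
W1 = [[1.0, -2.0, 0.0, 0.5],
      [0.0, 1.0, 1.0, -1.0]]
b1 = [0.0, 0.0]
W2 = [[2.0, -1.0],
      [-1.0, 3.0]]
b2 = [0.0, 0.0]
x = [1.0, 0.0, 0.5, 0.0]

target, sal = saliency(x, W1, b1, W2, b2)
print("predicted class:", target)
print("saliency per input:", sal)
```

Inputs with larger saliency values are those to which the chosen class score is locally most sensitive. On a document image, reshaping such a vector back to the image grid gives the kind of map in which regions like titles or tables can stand out.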
Pages: 1201-1207
Number of pages: 7