Visual and Textual Deep Feature Fusion for Document Image Classification

Cited by: 29

Authors:

Bakkali, Souhail [1]

Ming, Zuheng [1]

Coustaty, Mickael [1]

Rusinol, Marcal [2]

Affiliations:

[1] Univ La Rochelle, L3i, La Rochelle, France

[2] Univ Autonoma Barcelona, CVC, Barcelona, Spain

Source:

2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020) | 2020

DOI:

10.1109/CVPRW50498.2020.00289

Chinese Library Classification number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The topic of text document image classification has been explored extensively over the past few years. Most recent approaches have handled this task by jointly learning the visual features of document images and their corresponding textual contents. Due to the various structures of document images, the extraction of semantic information from their textual content is beneficial for document image processing tasks such as document retrieval, information extraction, and text classification. In this work, a two-stream neural architecture is proposed to perform the document image classification task. We conduct an exhaustive investigation of currently widely used neural networks as well as word embedding procedures used as backbones, in order to extract both visual and textual features from document images. Moreover, a joint feature learning approach that combines image features and text embeddings is introduced as a late fusion methodology. Both the theoretical analysis and the experimental results demonstrate the superiority of our proposed joint feature learning method compared with the single modalities. This joint learning approach outperforms the state-of-the-art results with a classification accuracy of 97.05% on the large-scale RVL-CDIP dataset.
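
The late fusion idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which fusion operator the authors use, so the weighted averaging of per-stream class probabilities below is an assumption, as are the function names (`softmax`, `late_fusion`, `predict`), the `visual_weight` parameter, and the toy logits for three hypothetical classes. It is not the authors' implementation.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(visual_logits, textual_logits, visual_weight=0.5):
    """Fuse two streams at the decision level.

    Assumption: fusion is a weighted average of each stream's class
    probabilities. The source abstract does not specify the operator.
    """
    if len(visual_logits) != len(textual_logits):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= visual_weight <= 1.0:
        raise ValueError("visual_weight must lie in [0, 1]")
    p_visual = softmax(visual_logits)
    p_textual = softmax(textual_logits)
    return [
        visual_weight * v + (1.0 - visual_weight) * t
        for v, t in zip(p_visual, p_textual)
    ]


def predict(probabilities):
    """Return the index of the most probable class."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


# Toy scores for three hypothetical classes (illustrative values only).
visual_logits = [2.0, 1.0, 0.1]   # the visual stream favours class 0
textual_logits = [0.5, 3.0, 0.2]  # the textual stream favours class 1

fused = late_fusion(visual_logits, textual_logits)
print(predict(fused))
```

In this toy case the two streams disagree, and the fused prediction follows the more confident stream. Setting `visual_weight` to 1.0 or 0.0 recovers the corresponding single-modality prediction, which gives a simple baseline to compare against.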

Pages: 2394-2403

Page count: 10

50 entries in total

[1] Multi-Class Document Image Classification using Deep Visual and Textual Features
Sevim, Semih
Ekinci, Ekin
Omurca, Sevinc Ilhan
Edinc, Eren Berk
Eken, Suleyman
Erdem, Turkucan
Sayar, Ahmet
INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2022, 21 (02)
[2] VISUAL AND TEXTUAL FEATURE FUSION FOR AUTOMATIC CUSTOMS TARIFF CLASSIFICATION
Turhan, Bilgehan
Akar, Gozde B.
Turhan, Cigdem
Yuksel, Cihan
2015 IEEE 16TH INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION, 2015 : 76 - 81
[3] Deep Multiple Feature Fusion for Hyperspectral Image Classification
Cao, Xianghai
Li, Renjie
Wen, Li
Feng, Jie
Jiao, Licheng
IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2018, 11 (10) : 3880 - 3891
[4] Hyperspectral Image Classification With Deep Feature Fusion Network
Song, Weiwei
Li, Shutao
Fang, Leyuan
Lu, Ting
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2018, 56 (06) : 3173 - 3184
[5] Improving classification of an industrial document image database by combining visual and textual features
Augereau, Olivier
Journet, Nicholas
Vialard, Anne
Domenger, Jean-Philippe
2014 11TH IAPR INTERNATIONAL WORKSHOP ON DOCUMENT ANALYSIS SYSTEMS (DAS 2014), 2014 : 314 - 318
[6] Deep Belief Networks for Feature Fusion in Hyperspectral Image Classification
Ghassemi, Mohammad
Ghassemian, Hassan
Imani, Maryam
PROCEEDINGS OF THE 2018 IEEE INTERNATIONAL CONFERENCE ON AEROSPACE ELECTRONICS AND REMOTE SENSING TECHNOLOGY (ICARES 2018), 2018
[7] A decisive content based image retrieval approach for feature fusion in visual and textual images
Unar, Salahuddin
Wang, Xingyuan
Wang, Chunpeng
Wang, Yu
KNOWLEDGE-BASED SYSTEMS, 2019, 179 : 8 - 20
[8] Visual-Textual Late Semantic Fusion Using Deep Neural Network for Document Categorization
Wang, Cheng
Yang, Haojin
Meinel, Christoph
NEURAL INFORMATION PROCESSING, PT I, 2015, 9489 : 662 - 670
[9] Visual feature coding based on heterogeneous structure fusion for image classification
Lin, Guangfeng
Fan, Caixia
Zhu, Hong
Miu, Yalin
Kang, Xiaobing
INFORMATION FUSION, 2017, 36 : 275 - 283
[10] Deep Feature Extraction and Feature Fusion for Bi-Temporal Satellite Image Classification
Asokan, Anju
Anitha, J.
Patrut, Bogdan
Danciulescu, Dana
Hemanth, D. Jude
CMC-COMPUTERS MATERIALS & CONTINUA, 2021, 66 (01) : 373 - 388
