An effective hierarchical extreme learning machine based multimodal fusion framework

Cited by: 13

Authors:

Du, Fang [1]

Zhang, Jiangshe [1]

Ji, Nannan [1]

Shi, Guang [1]

Zhang, Chunxia [1]

Affiliations:

[1] Xi'an Jiaotong University, School of Mathematics and Statistics, 28 Xianning West Road, Xi'an, Shaanxi, People's Republic of China

Source:

Neurocomputing | 2018 / Volume 322

Funding:

National Natural Science Foundation of China;

Keywords:

Multimodal learning; Multimodal fusion; Extreme learning machine; Deep learning;

DOI:

10.1016/j.neucom.2018.09.005

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deep learning has been successfully applied to multimodal representation learning. Like single-modality deep learning methods, these multimodal methods consist of greedy layer-wise feedforward propagation followed by backpropagation (BP) fine-tuning driven by diverse targets. These models have the drawback of being time-consuming. The extreme learning machine (ELM), by contrast, is a fast learning algorithm for single-hidden-layer feedforward neural networks, and previous work has shown the effectiveness of an ELM-based hierarchical framework for multilayer perceptrons. In this paper, we introduce an ELM-based hierarchical framework for multimodal data. The proposed architecture consists of three main components: (1) self-taught feature extraction for each modality by an ELM-based sparse autoencoder, (2) fused representation learning based on the features learned in the previous step, and (3) supervised feature classification based on the fused representation. This is a strictly feedforward framework: once a layer is established, its weights are fixed without fine-tuning. It therefore learns much more efficiently than gradient-based multimodal deep learning methods. We conduct experiments on the MNIST, XRMB and NUS datasets; the proposed algorithm converges faster and achieves better classification performance than other existing multimodal deep learning models. © 2018 Elsevier B.V. All rights reserved.
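
The three-stage pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it substitutes a ridge (L2) regularized ELM autoencoder for the sparse one named in the abstract, assumes that fusion is done by concatenating the per-modality features and passing them through another ELM autoencoder, and uses synthetic two-modality data. All function names, layer sizes and the regularization value are hypothetical.

```python
import numpy as np

def random_hidden(X, W, b):
    # Random, untrained hidden layer: the defining feature of an ELM.
    return np.tanh(X @ W + b)

def ridge_solve(H, T, reg):
    # Closed-form output weights: (H^T H + reg*I)^{-1} H^T T. No backpropagation.
    return np.linalg.solve(H.T @ H + reg * np.eye(H.shape[1]), H.T @ T)

def elm_autoencoder(X, n_hidden, rng, reg=1e-2):
    # ASSUMPTION: ridge penalty used as a stand-in for the sparse (L1) penalty.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    # beta reconstructs X from the random hidden layer; shape (n_hidden, n_features).
    return ridge_solve(random_hidden(X, W, b), X, reg)

def encode(X, beta):
    # Use the transposed output weights as a fixed feature map.
    return np.tanh(X @ beta.T)

def elm_fit(F, T, n_hidden, rng, reg=1e-2):
    W = rng.standard_normal((F.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    return W, b, ridge_solve(random_hidden(F, W, b), T, reg)

def elm_predict(F, model):
    W, b, beta = model
    return np.argmax(random_hidden(F, W, b) @ beta, axis=1)

# Synthetic two-class data observed through two modalities (illustrative only).
rng = np.random.default_rng(0)
n = 200
y = np.arange(n) % 2
centers = (2 * y - 1)[:, None]
Xa = centers + 0.5 * rng.standard_normal((n, 10))  # modality A, 10 features
Xb = centers + 0.5 * rng.standard_normal((n, 6))   # modality B, 6 features

# (1) Modality-specific unsupervised feature extraction.
beta_a = elm_autoencoder(Xa, 20, rng)
beta_b = elm_autoencoder(Xb, 20, rng)
feats = np.hstack([encode(Xa, beta_a), encode(Xb, beta_b)])

# (2) Fused representation (ASSUMPTION: concatenate, then one more autoencoder).
beta_f = elm_autoencoder(feats, 30, rng)
fused = encode(feats, beta_f)

# (3) Supervised ELM classifier on the fused representation.
model = elm_fit(fused, np.eye(2)[y], 50, rng)
acc = np.mean(elm_predict(fused, model) == y)
print("training accuracy:", acc)
```

Every layer here is fitted once by a linear solve and then left fixed, which is the property the abstract credits for the efficiency gain over gradient-based fine-tuning.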

Pages: 141-150

Page count: 10
