Combination of loss functions for deep text classification

Cited by: 13
Authors
Hajiabadi, Hamideh [1]
Molla-Aliod, Diego [2]
Monsefi, Reza [1]
Yazdi, Hadi Sadoghi [1]
Affiliations
[1] FUM, Dept Comp, Mashhad, Razavi Khorasan, Iran
[2] Macquarie Univ, Sydney, NSW 2109, Australia
Keywords
Loss Function; Convolutional neural network (CNN); Ensemble method; Multi-class classifier; ENSEMBLE; CORRENTROPY
DOI
10.1007/s13042-019-00982-x
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Ensemble methods have been shown to improve the results of statistical classifiers by combining multiple single learners into a strong one. In this paper, we explore the use of ensemble methods at the level of the objective function of a deep neural network. We propose a novel objective function that is a linear combination of single losses and integrate it into a deep neural network. By doing so, the weights associated with the linear combination of losses are learned by backpropagation during the training stage. We study the impact of such an ensemble loss function on state-of-the-art convolutional neural networks for text classification. We show the effectiveness of our approach through comprehensive experiments on text classification. The experimental results demonstrate a significant improvement compared with the conventional state-of-the-art methods in the literature.
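The abstract describes an objective that is a linear combination of single losses whose combination weights are trained by backpropagation. The following is a minimal sketch of that idea in plain Python, not the authors' implementation. The three component losses, the softmax parameterization of the weights, and all function names are assumptions chosen for illustration; the abstract does not state which losses are combined or how the weights are constrained.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]


def one_hot(i, target):
    return 1.0 if i == target else 0.0


def cross_entropy(probs, target):
    return -math.log(probs[target])


def squared_error(probs, target):
    return sum((p - one_hot(i, target)) ** 2 for i, p in enumerate(probs))


def absolute_error(probs, target):
    return sum(abs(p - one_hot(i, target)) for i, p in enumerate(probs))


# Hypothetical choice of single losses; the abstract does not list them.
LOSSES = [cross_entropy, squared_error, absolute_error]


def ensemble_loss(probs, target, weight_logits):
    """Linear combination of the single losses.

    Assumption: the combination weights are softmax(weight_logits), which
    keeps them positive and summing to one.
    """
    lam = softmax(weight_logits)
    values = [loss(probs, target) for loss in LOSSES]
    total = sum(l * v for l, v in zip(lam, values))
    return total, lam, values


def grad_weight_logits(lam, values, total):
    """Gradient of the combined loss with respect to the weight logits.

    For lam = softmax(a): d total / d a_j = lam_j * (values_j - total).
    This is the quantity backpropagation would deliver to the weights.
    """
    return [l * (v - total) for l, v in zip(lam, values)]


if __name__ == "__main__":
    probs = softmax([2.0, 0.5, -1.0])  # stand-in for network output
    total, lam, values = ensemble_loss(probs, 0, [0.0, 0.0, 0.0])
    print("combined loss:", total)
    print("weight gradient:", grad_weight_logits(lam, values, total))
```

In a real network the same gradient step would also update the classifier parameters. Note that under this bare objective the weights simply drift toward whichever single loss is currently smallest, so a practical scheme needs some additional constraint or regularizer, which the abstract does not specify.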
Pages: 751-761
Page count: 11