Large-Margin Regularized Softmax Cross-Entropy Loss

Cited by: 32
Authors
Li, Xiaoxu [1 ]
Chang, Dongliang [1 ]
Tian, Tao [1 ]
Cao, Jie [1 ]
Affiliation
[1] Lanzhou Univ Technol, Sch Comp & Commun, Lanzhou 730050, Gansu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Neural networks; cross-entropy loss; large-margin regularization
DOI
10.1109/ACCESS.2019.2897692
Chinese Library Classification
TP [Automation technology; computer technology]
Discipline code
0812
Abstract
Softmax cross-entropy loss with L2 regularization is widely used in the machine learning and neural network communities. The traditional softmax cross-entropy loss focuses only on fitting or classifying the training data accurately and does not explicitly encourage a large decision margin, so several loss functions have been proposed to improve generalization by addressing this problem. However, these loss functions make model optimization more difficult. Inspired by regularized logistic regression, in which the regularization term adjusts the width of the decision margin and can be seen as an approximation of a support vector machine, we propose a large-margin regularization method for the softmax cross-entropy loss. The proposed loss has two advantages: it improves generalization performance, and it is easy to optimize. Experimental results on three small-sample datasets show that our regularization method achieves good performance and outperforms popular existing regularization methods for neural networks.
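The abstract does not give the exact form of the proposed regularizer. As an illustrative sketch only, the code below combines a numerically stable softmax cross-entropy with a generic hinge-style margin penalty that encourages the true-class logit to exceed every other logit by a margin; the `margin` and `lam` hyperparameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def softmax_cross_entropy_with_margin(logits, labels, margin=1.0, lam=0.1):
    """Softmax cross-entropy plus a hinge-style large-margin penalty.

    Illustrative sketch only: the paper's exact regularizer is not stated
    in the abstract; `margin` and `lam` are assumed hyperparameters.
    logits: (n, k) array of class scores; labels: (n,) int array.
    """
    n = logits.shape[0]
    # Numerically stable log-softmax and cross-entropy.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(n), labels].mean()

    # Margin penalty: positive when the largest non-target logit comes
    # within `margin` of (or exceeds) the true-class logit.
    true_logit = logits[np.arange(n), labels]
    masked = logits.copy()
    masked[np.arange(n), labels] = -np.inf
    max_other = masked.max(axis=1)
    penalty = np.maximum(0.0, max_other + margin - true_logit).mean()

    return ce + lam * penalty
```

With `lam = 0` this reduces to the plain softmax cross-entropy; larger `lam` trades training fit for a wider decision margin, which is the general trade-off the abstract describes.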
Pages: 19572-19578 (7 pages)