Risk-sensitive loss functions for sparse multi-category classification problems

Cited by: 92
Authors
Suresh, S. [1 ]
Sundararajan, N. [1 ]
Saratchandran, P. [1 ]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 638768, Singapore
Keywords
multi-category classification; neural network; risk-sensitive loss function; cross-entropy; satellite imaging; micro-array gene expression;
DOI
10.1016/j.ins.2008.02.009
CLC Number
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
In this paper, we propose two risk-sensitive loss functions for multi-category classification problems in which the number of training samples is small and/or there is a high imbalance in the number of samples per class. Such problems are common in the bioinformatics and medical diagnosis areas. The loss functions most commonly used in the literature do not perform well on these problems because they minimize only the approximation error and neglect the estimation error caused by the imbalance in the training set. The proposed risk-sensitive loss functions minimize both the approximation and the estimation error. We present an error analysis for the risk-sensitive loss functions along with other well-known loss functions. Using a neural architecture, classifiers incorporating these risk-sensitive loss functions have been developed and their performance evaluated on two real-world multi-class classification problems, viz., a satellite image classification problem and a microarray gene-expression-based cancer classification problem. To study the effectiveness of the proposed loss functions, we deliberately imbalanced the training samples in the satellite image problem and compared the performance of our neural classifiers with those developed using other well-known loss functions. The results indicate the superior performance of the neural classifier using the proposed loss functions, both in terms of overall and per-class classification accuracy. Performance comparisons have also been carried out on a number of benchmark problems where the data is normal, i.e., neither sparse nor imbalanced. The results indicate similar or better performance of the proposed loss functions compared to the well-known loss functions. Published by Elsevier Inc.
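The abstract conveys the core idea, weighting the loss so that errors on under-represented classes carry a higher risk, but does not give the exact functional form of the two proposed losses. The sketch below is therefore only an illustration of that general idea, not the authors' formulation: a hypothetical inverse-frequency risk-weighted cross-entropy in NumPy. The helper names (class_risk_weights, risk_weighted_cross_entropy) and the inverse-frequency weighting scheme are assumptions introduced for illustration.

import numpy as np

def class_risk_weights(y, n_classes):
    # Inverse-frequency risk weights: rare classes get larger weights.
    # This is one common heuristic for imbalanced data; the paper's
    # actual risk factors may be defined differently.
    counts = np.bincount(y, minlength=n_classes).astype(float)
    return counts.sum() / (n_classes * np.maximum(counts, 1.0))

def risk_weighted_cross_entropy(probs, y, weights, eps=1e-12):
    # Mean cross-entropy with a per-class risk weight applied to each sample.
    #   probs   : (n_samples, n_classes) predicted class probabilities
    #   y       : (n_samples,) integer class labels
    #   weights : (n_classes,) per-class risk weights
    sample_losses = -np.log(probs[np.arange(len(y)), y] + eps)
    return np.mean(weights[y] * sample_losses)

# Toy imbalanced 3-class example: class 2 is rare.
y = np.array([0, 0, 0, 0, 1, 1, 1, 2])
probs = np.full((8, 3), 1.0 / 3.0)   # an uninformed classifier
w = class_risk_weights(y, n_classes=3)
print(w)                             # class 2 receives the largest weight
print(risk_weighted_cross_entropy(probs, y, w))

Under plain (unweighted) cross-entropy, a classifier can minimize the training loss by favoring the majority classes; the per-class weights make misclassifying a rare-class sample costlier, which is one way of addressing the estimation error the abstract attributes to training-set imbalance.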
Pages: 2621-2638
Page count: 18