Source-aware Partitioning for Robust Cross-validation

Cited: 2
Authors
Kilinc, Ozsel [1]
Uysal, Ismail [1]
Affiliations
[1] Univ S Florida, Elect Engn, Tampa, FL 33620 USA
Source
2015 IEEE 14TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA) | 2015
Keywords
machine learning; cross-validation; source-aware; robust learning; algorithm
DOI
10.1109/ICMLA.2015.216
CLC number
TP3 [computing technology; computer technology]
Subject classification code
0812
Abstract
One of the most critical components of engineering a machine learning algorithm for a live application is robust performance assessment prior to deployment. Cross-validation is used to forecast a specific algorithm's classification or prediction accuracy on new input data given a finite dataset for training and testing the algorithm. The two most well-known cross-validation techniques, random subsampling (RSS) and K-fold, generalize the assessment results of machine learning algorithms in a non-exhaustive random manner. In this work we first show that for an inertia-based activity recognition problem, where data is collected from different users of a wrist-worn wireless accelerometer, random partitioning of the data, regardless of cross-validation technique, results in statistically similar average accuracies for a standard feed-forward neural network classifier. We then propose a novel source-aware partitioning technique in which samples from specific users are completely left out of the training/validation sets in rotation. The average error for the proposed cross-validation method is significantly higher, with a lower standard deviation, which is a major indicator of cross-validation robustness. The approximately 30% increase in average error rate implies that source-aware cross-validation could be a better indicator of live algorithm performance, where test data statistics differ significantly from the training data due to the source (or user)-sensitive nature of the process data.
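The rotation described in the abstract, holding out all samples from one user (source) per fold, amounts to leave-one-source-out splitting. A minimal sketch of such a partitioner is shown below; the function and variable names are illustrative and not taken from the paper:

```python
def source_aware_folds(samples, sources):
    """Yield (train_idx, test_idx) index pairs. Each fold holds out every
    sample from a single source, so the classifier is always evaluated on
    data from a user it has never seen during training."""
    unique_sources = sorted(set(sources))
    for held_out in unique_sources:
        test_idx = [i for i, s in enumerate(sources) if s == held_out]
        train_idx = [i for i, s in enumerate(sources) if s != held_out]
        yield train_idx, test_idx

# Usage: 6 samples collected from 3 users -> 3 folds, one user left out each.
sources = ["u1", "u1", "u2", "u2", "u3", "u3"]
samples = list(range(len(sources)))
folds = list(source_aware_folds(samples, sources))
```

In contrast to RSS or K-fold, which mix every user's samples into both sides of each split, this scheme keeps the per-user statistics of the test fold disjoint from training, which is what drives the higher (and more realistic) error estimate reported in the paper.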
Pages: 1083-1088
Page count: 6
References
12 in total
  • [1] [Anonymous], 2005, P 2005 JOINT C SMART
  • [2] Bengio Y, 2004, J MACH LEARN RES, V5, P1089
  • [3] Bruno B, 2013, IEEE INT CONF ROBOT, P1602, DOI 10.1109/ICRA.2013.6630784
  • [4] Dargie W., 2009, P INT C COMP COMM NE, P3
  • [5] Hagan MT, Menhaj MB, 1994, "Training feedforward networks with the Marquardt algorithm", IEEE Transactions on Neural Networks, V5(6), P989-993
  • [6] Hastie T., 2009, ELEMENTS STAT LEARNI, DOI DOI 10.1007/978-0-387-84858-7
  • [7] Kilinc O., 2015, 2015 IEEE INT C MACH
  • [8] Kohavi R., 1995, IJCAI-95. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, P1137
  • [9] Moller MF, 1993, "A scaled conjugate-gradient algorithm for fast supervised learning", Neural Networks, V6(4), P525-533
  • [10] Ng A. Y., 1997, Proceedings of the Fourteenth International Conference on Machine Learning, P245