Source-aware Partitioning for Robust Cross-validation

Cited: 2
Authors
Kilinc, Ozsel [1]
Uysal, Ismail [1]
Affiliations
[1] Univ S Florida, Elect Engn, Tampa, FL 33620 USA
Source
2015 IEEE 14TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA) | 2015
Keywords
machine learning; cross-validation; source-aware; robust learning; algorithm
DOI
10.1109/ICMLA.2015.216
CLC number
TP3 [computing technology; computer technology]
Subject classification code
0812
Abstract
One of the most critical components of engineering a machine learning algorithm for a live application is robust performance assessment prior to deployment. Cross-validation is used to forecast a specific algorithm's classification or prediction accuracy on new input data given a finite dataset for training and testing the algorithm. The two most well-known cross-validation techniques, random subsampling (RSS) and K-fold, generalize the assessment results of machine learning algorithms in a non-exhaustive random manner. In this work we first show that for an inertia-based activity recognition problem, where data is collected from different users of a wrist-worn wireless accelerometer, random partitioning of the data, regardless of cross-validation technique, results in statistically similar average accuracies for a standard feed-forward neural network classifier. We then propose a novel source-aware partitioning technique in which samples from specific users are completely left out of the training/validation sets in rotation. The average error for the proposed cross-validation method is significantly higher, with a lower standard deviation, which is a major indicator of cross-validation robustness. The approximately 30% increase in average error rate implies that source-aware cross-validation could be a better indicator of live algorithm performance, where test data statistics differ significantly from the training data due to the source (or user)-sensitive nature of the process data.
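The rotation described in the abstract, holding out all samples from one user (source) per fold, amounts to leave-one-source-out splitting. A minimal sketch of such a partitioner is shown below; the function and variable names are illustrative and not taken from the paper:

```python
def source_aware_folds(samples, sources):
    """Yield (train_idx, test_idx) index pairs. Each fold holds out every
    sample from a single source, so the classifier is always evaluated on
    data from a user it has never seen during training."""
    unique_sources = sorted(set(sources))
    for held_out in unique_sources:
        test_idx = [i for i, s in enumerate(sources) if s == held_out]
        train_idx = [i for i, s in enumerate(sources) if s != held_out]
        yield train_idx, test_idx

# Usage: 6 samples collected from 3 users -> 3 folds, one user left out each.
sources = ["u1", "u1", "u2", "u2", "u3", "u3"]
samples = list(range(len(sources)))
folds = list(source_aware_folds(samples, sources))
```

In contrast to RSS or K-fold, which mix every user's samples into both sides of each split, this scheme keeps the per-user statistics of the test fold disjoint from training, which is what drives the higher (and more realistic) error estimate reported in the paper.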
Pages: 1083-1088
Page count: 6
References
12 in total
  • [1] [Anonymous], 2005, P 2005 JOINT C SMART
  • [2] Bengio Y, 2004, J MACH LEARN RES, V5, P1089
  • [3] Bruno B, 2013, IEEE INT CONF ROBOT, P1602, DOI 10.1109/ICRA.2013.6630784
  • [4] Dargie W., 2009, P INT C COMP COMM NE, P3
  • [5] Hagan MT, Menhaj MB, 1994, "Training feedforward networks with the Marquardt algorithm", IEEE Transactions on Neural Networks, V5(6), P989-993
  • [6] Hastie T., 2009, ELEMENTS STAT LEARNI, DOI DOI 10.1007/978-0-387-84858-7
  • [7] Kilinc O., 2015, 2015 IEEE INT C MACH
  • [8] Kohavi R., 1995, IJCAI-95. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, P1137
  • [9] Moller MF, 1993, "A scaled conjugate-gradient algorithm for fast supervised learning", Neural Networks, V6(4), P525-533
  • [10] Ng A. Y., 1997, Proceedings of the Fourteenth International Conference on Machine Learning, P245