THEORETICAL ANALYSES OF CROSS-VALIDATION ERROR AND VOTING IN INSTANCE-BASED LEARNING

Cited by: 5
Author
TURNEY, P
Affiliation
[1] Knowledge Systems Laboratory, Institute for Information Technology, National Research Council Canada, Ottawa, ON
Keywords
CROSS-VALIDATION; SIMPLICITY; BIAS; VARIANCE; VOTING; INSTANCE-BASED LEARNING
DOI
10.1080/09528139408953793
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper begins with a general theory of error in cross-validation testing of algorithms for supervised learning from examples. It is assumed that the examples are described by attribute-value pairs, where the values are symbolic. Cross-validation requires a set of training examples and a set of testing examples. The value of the attribute that is to be predicted is known to the learner in the training set, but unknown in the testing set. The theory demonstrates that cross-validation error has two components: error on the training set (inaccuracy) and sensitivity to noise (instability). This general theory is then applied to voting in instance-based learning. Given an example in the testing set, a typical instance-based learning algorithm predicts the designated attribute by voting among the k nearest neighbours (the k most similar examples) to the testing example in the training set. Voting is intended to increase the stability (resistance to noise) of instance-based learning, but a theoretical analysis shows that there are circumstances in which voting can be destabilising. The theory suggests ways to minimize cross-validation error, by ensuring that voting is stable and does not adversely affect accuracy.
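The voting step described in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: the Hamming distance as the similarity measure, the function names `hamming_distance` and `knn_vote`, and the toy data are all assumptions made for illustration. Examples are tuples of symbolic attribute values, and the first training example is given a deliberately wrong label to stand in for noise.

```python
from collections import Counter


def hamming_distance(a, b):
    """Count the attributes on which two symbolic examples differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_vote(training, query, k):
    """Predict the designated attribute of `query` by majority vote
    among its k nearest training examples.

    `training` is a list of (attributes, label) pairs.
    """
    # sorted() is stable, so examples at equal distance keep their
    # original training-set order.
    neighbours = sorted(
        training, key=lambda ex: hamming_distance(ex[0], query)
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (colour, shape, size) -> label.
training = [
    (("red", "round", "small"), "banana"),   # noisy label on purpose
    (("red", "round", "large"), "apple"),
    (("green", "round", "small"), "apple"),
    (("yellow", "long", "small"), "banana"),
    (("yellow", "long", "large"), "banana"),
]
query = ("red", "round", "small")

print(knn_vote(training, query, k=1))  # banana: the single nearest example is the noisy one
print(knn_vote(training, query, k=3))  # apple: two clean neighbours outvote the noisy one
```

In this toy case, voting among three neighbours overrides the mislabelled nearest example. The sketch shows only the mechanics of voting; it does not reproduce the paper's analysis of when voting becomes destabilising.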
Pages: 331-360
Number of pages: 30