THEORETICAL ANALYSES OF CROSS-VALIDATION ERROR AND VOTING IN INSTANCE-BASED LEARNING

Cited by: 5
Author
TURNEY, P
Affiliation
[1] Knowledge Systems Laboratory, Institute for Information Technology, National Research Council Canada, Ottawa, ON
Keywords
CROSS-VALIDATION; SIMPLICITY; BIAS; VARIANCE; VOTING; INSTANCE-BASED LEARNING;
DOI
10.1080/09528139408953793
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper begins with a general theory of error in cross-validation testing of algorithms for supervised learning from examples. It is assumed that the examples are described by attribute-value pairs, where the values are symbolic. Cross-validation requires a set of training examples and a set of testing examples. The value of the attribute that is to be predicted is known to the learner in the training set, but unknown in the testing set. The theory demonstrates that cross-validation error has two components: error on the training set (inaccuracy) and sensitivity to noise (instability). This general theory is then applied to voting in instance-based learning. Given an example in the testing set, a typical instance-based learning algorithm predicts the designated attribute by voting among the k nearest neighbours (the k most similar examples) to the testing example in the training set. Voting is intended to increase the stability (resistance to noise) of instance-based learning, but a theoretical analysis shows that there are circumstances in which voting can be destabilising. The theory suggests ways to minimize cross-validation error, by ensuring that voting is stable and does not adversely affect accuracy.
Pages: 331-360
Page count: 30
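
The voting procedure described in the abstract can be illustrated with a minimal sketch. It is not the paper's own formulation: the Hamming distance over symbolic attributes, the function names (hamming_distance, knn_vote, error_rate), and the toy weather data are all assumptions made for illustration, since the abstract does not specify a similarity measure or a data set.

```python
from collections import Counter


def hamming_distance(a, b):
    """Count the attributes on which two symbolic examples differ (assumed measure)."""
    return sum(x != y for x, y in zip(a, b))


def knn_vote(train, query, k):
    """Predict the designated attribute by majority vote among the k nearest
    training examples. `train` is a list of (attributes, label) pairs."""
    # sorted() is stable, so examples at equal distance keep training-set order.
    nearest = sorted(train, key=lambda ex: hamming_distance(ex[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def error_rate(train, test, k):
    """Fraction of testing examples whose designated attribute is mispredicted."""
    wrong = sum(knn_vote(train, attrs, k) != label for attrs, label in test)
    return wrong / len(test)


# Hypothetical toy data: the third training example carries a noisy label.
train = [
    (("sunny", "hot", "high"), "no"),
    (("sunny", "hot", "normal"), "no"),
    (("sunny", "mild", "high"), "yes"),  # noise: assumed true label is "no"
    (("rainy", "mild", "normal"), "yes"),
    (("rainy", "cool", "normal"), "yes"),
]
test = [(("sunny", "mild", "high"), "no")]

print(error_rate(train, test, k=1))  # single nearest neighbour copies the noisy label
print(error_rate(train, test, k=3))  # voting among three neighbours outvotes it
```

In this toy case voting with k = 3 resists the noisy label that k = 1 reproduces. The abstract notes that the opposite can also happen, so the sketch shows only the intended stabilising effect, not the destabilising circumstances the paper analyses.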