THEORETICAL ANALYSES OF CROSS-VALIDATION ERROR AND VOTING IN INSTANCE-BASED LEARNING

Cited by: 5
Author
TURNEY, P
Affiliation
[1] Knowledge Systems Laboratory, Institute for Information Technology, National Research Council Canada, Ottawa, ON
Keywords
CROSS-VALIDATION; SIMPLICITY; BIAS; VARIANCE; VOTING; INSTANCE-BASED LEARNING
DOI
10.1080/09528139408953793
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper begins with a general theory of error in cross-validation testing of algorithms for supervised learning from examples. It is assumed that the examples are described by attribute-value pairs, where the values are symbolic. Cross-validation requires a set of training examples and a set of testing examples. The value of the attribute that is to be predicted is known to the learner in the training set, but unknown in the testing set. The theory demonstrates that cross-validation error has two components: error on the training set (inaccuracy) and sensitivity to noise (instability). This general theory is then applied to voting in instance-based learning. Given an example in the testing set, a typical instance-based learning algorithm predicts the designated attribute by voting among the k nearest neighbours (the k most similar examples) to the testing example in the training set. Voting is intended to increase the stability (resistance to noise) of instance-based learning, but a theoretical analysis shows that there are circumstances in which voting can be destabilising. The theory suggests ways to minimize cross-validation error by ensuring that voting is stable and does not adversely affect accuracy.
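As a rough illustration of the voting step the abstract describes, here is a minimal Python sketch of k-nearest-neighbour voting over symbolic attribute values. The Hamming distance, the toy data, and the names `hamming` and `knn_vote` are assumptions made for illustration; they are not taken from the paper, which may define similarity and voting differently.

```python
from collections import Counter

def hamming(a, b):
    """Count the attribute positions where two symbolic examples differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_vote(train, query, k):
    """Predict the label of `query` by majority vote among its k nearest
    training examples. `train` is a list of (attributes, label) pairs."""
    # sorted() is stable, so equally distant examples keep training-set order.
    neighbours = sorted(train, key=lambda ex: hamming(ex[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical toy training set; the first example is deliberately mislabelled
# to stand in for noise in the attribute to be predicted.
train = [
    (("red", "round", "small"), "pear"),    # noisy label
    (("red", "round", "large"), "apple"),
    (("red", "oval", "small"), "apple"),
    (("green", "oval", "large"), "pear"),
    (("green", "oval", "small"), "pear"),
]
query = ("red", "round", "small")

print(knn_vote(train, query, k=1))  # follows the single nearest (noisy) example
print(knn_vote(train, query, k=3))  # vote among the three nearest examples
```

In this toy case the single nearest neighbour carries the noisy label, while the vote among three neighbours outvotes it. This only illustrates the intended stabilising effect of voting; the circumstances under which voting becomes destabilising are the subject of the paper's analysis and are not reproduced here.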
Pages: 331 - 360
Page count: 30
Related papers
50 in total
  • [11] An optimization algorithm based on active and instance-based learning
    Fuentes, O
    Solorio, T
    MICAI 2004: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2004, 2972 : 242 - 251
  • [12] Extracting Web Data Using Instance-Based Learning
    Yanhong Zhai
    Bing Liu
    World Wide Web, 2007, 10 : 113 - 132
  • [13] Natural Neighbor Reduction Algorithm for Instance-based Learning
    Yang, Lijun
    Zhu, Qingsheng
    Huang, Jinlong
    Cheng, Dongdong
    Zhang, Cheng
    INTERNATIONAL JOURNAL OF COGNITIVE INFORMATICS AND NATURAL INTELLIGENCE, 2016, 10 (04) : 59 - 73
  • [14] Predicting Remaining Useful Life Based on Instance-based Learning
    Shahraki, Ameneh Forouzandeh
    Roy, Arighna
    Yadav, Om Prakash
    Rathore, Ajay Pal Singh
    2019 ANNUAL RELIABILITY AND MAINTAINABILITY SYMPOSIUM (RAMS 2019) - R & M IN THE SECOND MACHINE AGE - THE CHALLENGE OF CYBER PHYSICAL SYSTEMS, 2019,
  • [15] Weighted instance-based learning using representative intervals
    Gomez, Octavio
    Morales, Eduardo F.
    Gonzalez, Jesus A.
    MICAI 2007: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2007, 4827 : 420 - +
  • [16] An instance-based learning approach based on grey relational structure
    Huang, Chi-Chun
    Lee, Hahn-Ming
    APPLIED INTELLIGENCE, 2006, 25 (03) : 243 - 251
  • [17] On the Impact of Distance Metrics in Instance-Based Learning Algorithms
    Lopes, Noel
    Ribeiro, Bernardete
    PATTERN RECOGNITION AND IMAGE ANALYSIS (IBPRIA 2015), 2015, 9117 : 48 - 56
  • [18] Effects of domain characteristics on instance-based learning algorithms
    Okamoto, S
    Yugami, N
    THEORETICAL COMPUTER SCIENCE, 2003, 298 (01) : 207 - 233
  • [19] Estimating misclassification error: a closer look at cross-validation based methods
    Songthip Ounpraseuth
    Shelly Y Lensing
    Horace J Spencer
    Ralph L Kodell
    BMC Research Notes, 5 (1)
  • [20] Extracting Web data using instance-based learning
    Zhai, Yanhong
    Liu, Bing
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2007, 10 (02) : 113 - 132