Using genetic algorithms to optimize nearest neighbors for data mining

Times cited: 17
Authors
Ahn, Hyunchul [2]
Kim, Kyoung-jae [1]
Affiliations
[1] Dongguk Univ, Dept Management Informat Syst, Seoul 100715, South Korea
[2] Sungshin Womens Univ, Dept Business Adm, Coll Social Sci, Seoul 136742, South Korea
Keywords
case-based reasoning; genetic algorithms; number of neighbors to combine; stock market prediction; purchase prediction
DOI
10.1007/s10479-008-0325-2
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research]
Discipline classification code
070105; 12; 1201; 1202; 120202
Abstract
Case-based reasoning (CBR) is widely used in data mining for managerial applications because it often shows significant promise for improving the effectiveness of complex and unstructured decision making. There are, however, some limitations in designing appropriate case indexing and retrieval mechanisms, including feature selection and feature weighting. Some prior studies have pointed out that finding the optimal k parameter for the k-nearest neighbor (k-NN) algorithm is also one of the most important factors in designing an effective CBR system. Nonetheless, there have been few attempts to optimize the number of neighbors, especially using artificial intelligence (AI) techniques. This study proposes a genetic algorithm (GA) approach to optimize the number of neighbors to combine. We apply this model to two real-world cases involving stock market and online purchase prediction problems. Experimental results show that a GA-optimized k-NN approach may outperform traditional k-NN. These results also show that the proposed method performs as well as, or sometimes better than, other AI techniques in a performance comparison.
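The idea summarized in the abstract, searching for the number of neighbors k with a genetic algorithm, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the integer encoding of k, the averaging crossover, the small-step mutation, the leave-one-out fitness, and the synthetic two-class data are all assumptions chosen for brevity, and every function name (`knn_predict`, `loo_accuracy`, `ga_optimize_k`, `make_toy_data`) is hypothetical.

```python
import random
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k training cases closest to query (squared Euclidean)."""
    nearest = sorted(
        train,
        key=lambda case: sum((a - b) ** 2 for a, b in zip(case[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def loo_accuracy(data, k):
    """Leave-one-out accuracy of k-NN on data; used as the GA fitness (an assumption)."""
    hits = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, x, k) == y
    return hits / len(data)


def ga_optimize_k(data, k_max, pop_size=8, generations=15, mutation_rate=0.2, seed=0):
    """Evolve integer candidates for k in [1, k_max]; return (best_k, fitness)."""
    rng = random.Random(seed)
    cache = {}

    def fitness(k):
        # Cache evaluations, since the same k recurs across generations.
        if k not in cache:
            cache[k] = loo_accuracy(data, k)
        return cache[k]

    population = [rng.randint(1, k_max) for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the fitter half survives unchanged (elitism).
        parents = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a + b) // 2  # averaging crossover on the integer gene
            if rng.random() < mutation_rate:
                child += rng.choice([-2, -1, 1, 2])  # small-step mutation
            children.append(min(max(child, 1), k_max))  # keep k within bounds
        population = parents + children
    best = max(population, key=fitness)
    return best, fitness(best)


def make_toy_data(n_per_class=30, seed=1):
    """Two overlapping 2-D Gaussian classes; purely synthetic illustration data."""
    rng = random.Random(seed)
    data = []
    for label, center in ((0, 0.0), (1, 1.5)):
        for _ in range(n_per_class):
            data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data


data = make_toy_data()
best_k, acc = ga_optimize_k(data, k_max=15)
print("selected k:", best_k, "leave-one-out accuracy:", acc)
```

For a single integer parameter an exhaustive scan over k would be just as practical; a GA becomes more attractive when k is searched jointly with other retrieval settings such as the feature selection and feature weighting mentioned in the abstract.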
Pages: 5-18
Number of pages: 14