MEPAR-miner: Multi-expression programming for classification rule mining

Cited by: 28
Authors
Baykasoglu, Adil [1]
Ozbakir, Lale [2]
Affiliations
[1] Gaziantep Univ, Dept Ind Engn, TR-27310 Gaziantep, Turkey
[2] Erciyes Univ, Dept Ind Engn, Kayseri, Turkey
Keywords
data mining; classification rules; multi-expression programming; evolutionary programming
DOI
10.1016/j.ejor.2006.10.015
Chinese Library Classification (CLC) number
C93 [Management Science]
Discipline classification codes
12; 1201; 1202; 120202
Abstract
Classification and rule induction are two important tasks for extracting knowledge from data. In rule induction, knowledge is represented as IF-THEN rules, which are easy for problem-domain experts to understand and apply. This paper proposes a new chromosome representation and solution technique for rule induction based on Multi-Expression Programming (MEP), named MEPAR-miner (Multi-Expression Programming for Association Rule Mining). MEP is a relatively new evolutionary programming technique, first introduced in 2002 by Oltean and Dumitrescu. It uses a linear chromosome structure in which multiple logical expressions of different sizes represent different logical rules. MEP expressions can be encoded and implemented flexibly and efficiently. MEP is generally applied to prediction problems; this paper presents a new algorithm that enables MEP to discover classification rules. The performance of the developed algorithm is tested on nine publicly available binary and n-ary classification data sets. Extensive experiments demonstrate that MEPAR-miner can discover effective classification rules that are as good as (or better than) those obtained by traditional rule induction methods. It is also shown that an effective gene encoding structure directly improves the predictive accuracy of logical IF-THEN rules. (c) 2006 Elsevier B.V. All rights reserved.
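The idea of a linear chromosome that holds several logical expressions of different sizes can be illustrated with a minimal Python sketch. It follows the general MEP scheme, in which a gene is either a terminal or a function that refers to earlier genes, so every gene position encodes its own expression. It is not the encoding used in the paper: the gene layout, the attribute names (`age`, `bp`, `chol`), the thresholds, the toy data, and the use of plain accuracy as the quality measure are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple, Union

Record = Dict[str, float]
# Terminal gene (assumed layout): (attribute, comparison, threshold).
Terminal = Tuple[str, str, float]
# Function gene (assumed layout): (logical operator, index_a, index_b),
# where both indices point to EARLIER genes in the same chromosome.
Function = Tuple[str, int, int]
Gene = Union[Terminal, Function]

COMPARE = {"<=": lambda x, t: x <= t, ">": lambda x, t: x > t}
LOGIC = {"AND": lambda a, b: a and b, "OR": lambda a, b: a or b}


def evaluate_chromosome(chromosome: List[Gene], record: Record) -> List[bool]:
    """Return the truth value of the expression rooted at every gene."""
    values: List[bool] = []
    for gene in chromosome:
        head = gene[0]
        if head in LOGIC:  # function gene: combine two earlier results
            _, i, j = gene
            values.append(LOGIC[head](values[i], values[j]))
        else:  # terminal gene: test a single attribute
            attr, op, threshold = gene
            values.append(COMPARE[op](record[attr], threshold))
    return values


def accuracy_per_gene(chromosome: List[Gene], data: List[Record],
                      labels: List[bool]) -> List[float]:
    """Score each gene's expression as an IF-THEN rule for the positive class."""
    hits = [0] * len(chromosome)
    for record, label in zip(data, labels):
        for k, fired in enumerate(evaluate_chromosome(chromosome, record)):
            if fired == label:
                hits[k] += 1
    return [h / len(data) for h in hits]


def best_rule(chromosome: List[Gene], data: List[Record],
              labels: List[bool]) -> Tuple[int, float]:
    """Pick the gene whose expression classifies the data best."""
    scores = accuracy_per_gene(chromosome, data, labels)
    k = max(range(len(scores)), key=scores.__getitem__)
    return k, scores[k]


# Hypothetical chromosome: five genes, hence five candidate rules.
chromosome: List[Gene] = [
    ("age", ">", 50.0),     # gene 0
    ("bp", ">", 140.0),     # gene 1
    ("AND", 0, 1),          # gene 2: age > 50 AND bp > 140
    ("chol", "<=", 200.0),  # gene 3
    ("OR", 2, 3),           # gene 4: gene 2 OR chol <= 200
]
# Made-up toy records, not taken from any of the paper's data sets.
data: List[Record] = [
    {"age": 60, "bp": 150, "chol": 220},
    {"age": 65, "bp": 120, "chol": 230},
    {"age": 40, "bp": 160, "chol": 240},
    {"age": 55, "bp": 145, "chol": 180},
    {"age": 30, "bp": 110, "chol": 190},
]
labels = [True, False, False, True, False]

print(best_rule(chromosome, data, labels))  # (2, 1.0)
```

On this toy data, gene 2 encodes the rule "IF age > 50 AND bp > 140 THEN positive" and classifies all five records correctly, so it is selected as the chromosome's representative rule. Because function genes may only point backwards, one left-to-right pass evaluates every encoded expression.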
Pages: 767-784
Number of pages: 18