A constrained-syntax genetic programming system for discovering classification rules: application to medical data sets

Cited by: 78
Authors
Bojarczuk, CC
Lopes, HS
Freitas, AA
Michalkiewicz, EL
Affiliations
[1] Ctr Fed Educ Tecnol Parana, CEFETPR, CPGEI, Lab Bioinformat, BR-80230901 Curitiba, Parana, Brazil
[2] Univ Kent, Comp Lab, Canterbury CT2 7NF, Kent, England
[3] Hosp Erasto Gaertner, Setor Cirurgia Pediat, BR-81520060 Curitiba, Parana, Brazil
Keywords
constrained-syntax genetic programming; data mining; classification rules
DOI
10.1016/j.artmed.2003.06.001
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a new constrained-syntax genetic programming (GP) algorithm for discovering classification rules in medical data sets. The proposed GP contains several syntactic constraints to be enforced by the system using a disjunctive normal form representation, so that individuals represent valid rule sets that are easy to interpret. The GP is compared with C4.5, a well-known decision-tree-building algorithm, and with another GP that uses Boolean inputs (BGP), in five medical data sets: chest pain, Ljubljana breast cancer, dermatology, Wisconsin breast cancer, and pediatric adrenocortical tumor. For this last data set a new preprocessing step was devised for survival prediction. Computational experiments show that, overall, the GP algorithm obtained good results with respect to predictive accuracy and rule comprehensibility, by comparison with C4.5 and BGP. © 2004 Elsevier B.V. All rights reserved.
Pages: 27-48
Page count: 22
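Illustrative note: the abstract says individuals use a disjunctive normal form (DNF) representation so that rule sets stay valid and easy to interpret. The sketch below is not the authors' implementation. It only shows, under stated assumptions, how a DNF rule (an OR of AND-ed attribute conditions) could be evaluated against one record. The attribute names (`age`, `pain_score`, `ecg_abnormal`), the thresholds, and the helpers `make_condition` and `dnf_rule_fires` are all hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]
Condition = Callable[[Record], bool]

# Hypothetical set of comparison operators allowed in a condition.
_OPERATORS = {
    "<=": lambda a, b: a <= b,
    ">": lambda a, b: a > b,
    "==": lambda a, b: a == b,
}


def make_condition(attr: str, op: str, value: float) -> Condition:
    """Build a single test of the form `record[attr] <op> value`."""
    compare = _OPERATORS[op]
    return lambda record: compare(record[attr], value)


def dnf_rule_fires(rule: List[List[Condition]], record: Record) -> bool:
    """Evaluate a DNF rule: OR over conjunctions, AND within each conjunction."""
    return any(all(cond(record) for cond in conjunction) for conjunction in rule)


# Hypothetical rule (attribute names and thresholds are illustrative only):
#   (age > 60 AND pain_score > 7) OR (ecg_abnormal == 1)
rule = [
    [make_condition("age", ">", 60), make_condition("pain_score", ">", 7)],
    [make_condition("ecg_abnormal", "==", 1)],
]

patient_a = {"age": 65, "pain_score": 8, "ecg_abnormal": 0}
patient_b = {"age": 40, "pain_score": 3, "ecg_abnormal": 0}
patient_c = {"age": 40, "pain_score": 3, "ecg_abnormal": 1}
```

With these made-up records, `patient_a` satisfies the first conjunction, `patient_c` satisfies the second, and `patient_b` satisfies neither. Keeping the OR/AND nesting fixed is one simple way to guarantee that every individual reads as a flat, interpretable rule.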