Cascade of genetic algorithm and decision tree for cancer classification on gene expression data

Authors
Yeh, Jinn-Yi [1]
Wu, Tai-Hsi [2]
Affiliations
[1] Natl Chiayi Univ, Dept Management Informat Syst, Chiayi 600, Taiwan
[2] Natl Taipei Univ, Dept Business Adm, Taipei 237, Taiwan
Keywords
cancer classification; gene expression data; genetic algorithms; decision tree; SUPPORT VECTOR MACHINES; TUMOR CLASSIFICATION; MICROARRAY DATA; FEATURE-SELECTION; CLUSTER-ANALYSIS; VISUALIZATION; PREDICTION; DISCOVERY; SYSTEMS; CELL;
DOI
10.1111/j.1468-0394.2010.00522.x
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Cancer classification through gene expression data analysis has produced remarkable results, indicating that gene expression assays could significantly aid the development of efficient cancer diagnosis and classification platforms. However, cancer classification based on DNA array data remains a difficult problem. The main challenge is the overwhelming number of genes relative to the number of training samples, which implies that a large number of irrelevant genes must be dealt with. Another challenge is the noise inherent in the data set, which makes accurate classification more difficult when the sample size is small. We apply genetic algorithms (GAs) with an initial solution provided by t statistics, called t-GA, to select a group of relevant genes from cancer microarray data. A decision-tree-based cancer classifier is then built on these selected genes. The performance of this approach is evaluated by comparing it with other gene selection methods on publicly available gene expression data sets. Experimental results indicate that t-GA has the best performance among the gene selection methods compared. The Z-score figure also shows that some genes are consistently and preferentially chosen by t-GA in each data set.
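The abstract gives only the outline of t-GA, so the following is a minimal sketch of the general idea rather than the authors' implementation. It ranks genes by the absolute value of a two-sample t statistic, seeds one chromosome of a genetic algorithm with the top-ranked genes, and evolves binary gene masks. The Welch form of the t statistic, the one-point crossover, the bit-flip mutation, the elitism, the toy data, and all function names are assumptions made for illustration. The fitness function is passed in by the caller; in the setting described above it would plausibly be the accuracy of a decision tree trained on the selected genes, which is not implemented here.

```python
import math
import random


def t_statistic(a, b):
    """Welch two-sample t statistic for one gene (assumed variant)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    denom = math.sqrt(va / len(a) + vb / len(b))
    return (ma - mb) / denom if denom > 0 else 0.0


def rank_genes(X, y):
    """Return gene indices sorted by decreasing |t| between classes 0 and 1."""
    n_genes = len(X[0])
    scores = []
    for g in range(n_genes):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        scores.append(abs(t_statistic(a, b)))
    return sorted(range(n_genes), key=lambda g: -scores[g])


def seeded_population(ranking, n_genes, k, pop_size, rng):
    """First chromosome selects the top-k genes by |t|; the rest are random."""
    seed = [0] * n_genes
    for g in ranking[:k]:
        seed[g] = 1
    pop = [seed]
    while len(pop) < pop_size:
        mask = [0] * n_genes
        for g in rng.sample(range(n_genes), k):
            mask[g] = 1
        pop.append(mask)
    return pop


def evolve(pop, fitness, generations, mutation_rate, rng):
    """Simple GA: keep the best mask, breed the rest from the top half."""
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[: max(2, len(pop) // 2)]
        children = [scored[0][:]]  # elitism: best mask survives unchanged
        while len(children) < len(pop):
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, len(p1))  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]  # bit-flip mutation
            children.append(child)
        pop = children
    return max(pop, key=fitness)


# Toy data (hypothetical): 6 samples x 4 genes; gene 2 separates the classes.
X = [
    [1.0, 5.0, 0.1, 3.0],
    [1.2, 4.0, 0.2, 3.1],
    [0.9, 6.0, 0.0, 2.9],
    [1.1, 5.5, 2.0, 3.0],
    [1.0, 4.5, 2.1, 3.2],
    [1.3, 5.0, 1.9, 2.8],
]
y = [0, 0, 0, 1, 1, 1]

ranking = rank_genes(X, y)
rng = random.Random(0)
population = seeded_population(ranking, n_genes=4, k=2, pop_size=6, rng=rng)


def toy_fitness(mask):
    """Stand-in fitness: reward gene 2, penalize mask size (not a decision tree)."""
    return mask[2] - 0.1 * sum(mask)


best = evolve(population, toy_fitness, generations=10, mutation_rate=0.05, rng=rng)
```

Seeding the population with the t-ranked genes gives the search a reasonable starting point, while the random chromosomes keep some diversity. Replacing `toy_fitness` with a cross-validated classifier accuracy would bring the sketch closer to the setting the abstract describes.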
Pages: 201-218
Number of pages: 18
Related papers
  • [1] Analyzing Gene Expression Data: Fuzzy Decision Tree Algorithm applied to the Classification of Cancer Data
    Ludwig, Simone A.
    Jakobovic, Domagoj
    Picek, Stjepan
    2015 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE 2015), 2015
  • [2] A genetic filter for cancer classification on gene expression data
    Kim, Yong-Hyuk
    Yoon, Yourim
    BIO-MEDICAL MATERIALS AND ENGINEERING, 2015, 26 : S1993 - S2002
  • [3] An Integrated Feature Selection Algorithm for Cancer Classification using Gene Expression Data
    Ahmed, Saeed
    Kabir, Muhammad
    Ali, Zakir
    Arif, Muhammad
    Ali, Farman
    Yu, Dong-Jun
    COMBINATORIAL CHEMISTRY & HIGH THROUGHPUT SCREENING, 2018, 21 (09) : 631 - 645
  • [4] A fuzzy decision tree approach to start a genetic algorithm for data classification
    Espíndola, RP
    Ebecken, NFF
    DATA MINING V: DATA MINING, TEXT MINING AND THEIR BUSINESS APPLICATIONS, 2004, 10 : 133 - 142
  • [5] A Hybrid Ensemble Algorithm Combining AdaBoost and Genetic Algorithm for Cancer Classification with Gene Expression Data
    Lu, Huijuan
    Gao, Huiyun
    Ye, Minchao
    Wang, Xiuhui
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2021, 18 (03) : 863 - 870
  • [6] Decision forest for classification of gene expression data
    Huang, Jianping
    Fang, Hong
    Fan, Xiaohui
    COMPUTERS IN BIOLOGY AND MEDICINE, 2010, 40 (08) : 698 - 704
  • [7] Cancer classification based on microarray gene expression data using a principal component accumulation method
    Liu JingJing
    Cai WenSheng
    Shao XueGuang
    SCIENCE CHINA-CHEMISTRY, 2011, 54 (05) : 802 - 811
  • [8] Gene Expression Data Classification by VVRKFA
    Ghorai, Santanu
    Mukherjee, Anirban
    Dutta, Pranab K.
    2ND INTERNATIONAL CONFERENCE ON COMPUTER, COMMUNICATION, CONTROL AND INFORMATION TECHNOLOGY (C3IT-2012), 2012, 4 : 330 - 335
  • [9] Gene expression data classification using genetic algorithm-based feature selection
    Sonmez, Oznur Sinem
    Dagtekin, Mustafa
    Ensari, Tolga
    TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2021, 29 (07) : 3165 - 3179
  • [10] An ensemble correlation-based gene selection algorithm for cancer classification with gene expression data
    Piao, Yongjun
    Piao, Minghao
    Park, Kiejung
    Ryu, Keun Ho
    BIOINFORMATICS, 2012, 28 (24) : 3306 - 3315