Analysis of complexity indices for classification problems: Cancer gene expression data

Cited by: 41
Authors
Lorena, Ana C.
Costa, Ivan G. [1]
Spolaor, Newton
de Souto, Marcilio C. P. [1]
Affiliations
[1] Univ Fed Pernambuco, Ctr Informat, Recife, PE, Brazil
Keywords
Classification; Gene expression data; Complexity indices; Linear separability; BREAST-CANCER; MICROARRAY; SENSITIVITY; PREDICTION; ALGORITHMS; SELECTION; RANKING;
DOI
10.1016/j.neucom.2011.03.054
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cancer diagnosis at the molecular level has been made possible through the analysis of gene expression data. More specifically, machine learning (ML) techniques are typically used to build automatic diagnosis models (classifiers) from cancer gene expression data. Such data often present characteristics that can negatively affect the generalization ability of the resulting classifiers, among them data sparsity and an unbalanced class distribution. We investigate the results of a set of indices able to extract intrinsic complexity information from the data. Such measures can be used to analyze, among other things, which particular characteristics of cancer gene expression data most affect the prediction ability of support vector machine classifiers. In this context, we also show that, by applying a proper feature selection procedure to the data, one can reduce the influence of those characteristics on the error rates of the induced classifiers. © 2011 Elsevier B.V. All rights reserved.
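The data characteristics named in the abstract (sparsity, class imbalance, separability) can each be summarized by a simple number. The sketch below is illustrative only: the three functions, their names, and the toy data are assumptions made here, not the indices or data sets used in the paper. It computes a class imbalance ratio, a samples-per-feature ratio as a rough sparsity indicator, and the maximum per-feature Fisher discriminant ratio for a two-class problem.

```python
from collections import Counter
from statistics import mean, pvariance


def imbalance_ratio(labels):
    """Largest class size divided by smallest class size (1.0 means balanced)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def sparsity_ratio(samples):
    """Samples per feature; values far below 1 indicate sparse data."""
    return len(samples) / len(samples[0])


def max_fisher_ratio(samples, labels):
    """Maximum over features of (mu_a - mu_b)^2 / (var_a + var_b), two classes only."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    group_a = [x for x, y in zip(samples, labels) if y == classes[0]]
    group_b = [x for x, y in zip(samples, labels) if y == classes[1]]
    best = 0.0
    for j in range(len(samples[0])):
        col_a = [row[j] for row in group_a]
        col_b = [row[j] for row in group_b]
        denom = pvariance(col_a) + pvariance(col_b)
        if denom == 0:
            continue  # skip features with zero variance in both classes
        best = max(best, (mean(col_a) - mean(col_b)) ** 2 / denom)
    return best


# Hypothetical toy data: 5 samples, 3 "genes", two classes.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 4.0, 0.1],
    [0.8, 6.0, 0.3],
    [3.0, 5.5, 0.2],
    [3.2, 4.5, 0.1],
]
y = ["tumor", "tumor", "tumor", "normal", "normal"]

print("imbalance ratio:", imbalance_ratio(y))
print("samples per feature:", sparsity_ratio(X))
print("max Fisher ratio:", max_fisher_ratio(X, y))
```

In real gene expression data the samples-per-feature ratio is typically far below 1 (tens or hundreds of samples against thousands of genes), which is the sparsity the abstract refers to. A large Fisher ratio on at least one feature suggests that the classes are easy to separate along that feature.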
Pages: 33-42
Number of pages: 10
Related papers
50 in total
  • [21] Marker identification and classification of cancer types using gene expression data and SIMCA
    Bicciato, S
    Luchini, A
    Di Bello, C
    METHODS OF INFORMATION IN MEDICINE, 2004, 43 (01) : 4 - 8
  • [22] Feasible analysis of gene expression - a computational based classification for breast cancer
    Nandagopal, V
    Geeitha, S.
    Kumar, K. Vinoth
    Anbarasi, J.
    MEASUREMENT, 2019, 140 : 120 - 125
  • [23] Analyzing Gene Expression Data: Fuzzy Decision Tree Algorithm applied to the Classification of Cancer Data
    Ludwig, Simone A.
    Jakobovic, Domagoj
    Picek, Stjepan
    2015 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE 2015), 2015,
  • [24] Gene expression based cancer classification
    Tarek, Sara
    Abd Elwahab, Reda
    Shoman, Mahmoud
    EGYPTIAN INFORMATICS JOURNAL, 2017, 18 (03) : 151 - 159
  • [25] Biomimetic Pattern Recognition Method for Breast Cancer Using Gene Expression Data
    Yang, Xiao Li
    Yang, Si Ya
    He, Qiong
    Zhao, Hong Yan
    MATERIAL SCIENCES AND TECHNOLOGY, PTS 1 & 2, 2012, 560-561 : 401 - 409
  • [26] On the classification of microarray gene-expression data
    Basford, Kaye E.
    McLachlan, Geoffrey J.
    Rathnayake, Suren I.
    BRIEFINGS IN BIOINFORMATICS, 2013, 14 (04) : 402 - 410
  • [27] Gene-Network-Based Feature Set (GNFS) for Expression-Based Cancer Classification
    Doungpan, Narumol
    Engchuan, Worrawat
    Meechai, Asawin
    Fong, Simon
    Chan, Jonathan H.
    JOURNAL OF MEDICAL IMAGING AND HEALTH INFORMATICS, 2016, 6 (04) : 1093 - 1101
  • [28] Similarity-balanced discriminant neighbor embedding and its application to cancer classification based on gene expression data
    Zhang, Li
    Qian, Liqiang
    Ding, Chuntao
    Zhou, Weida
    Li, Fanzhang
    COMPUTERS IN BIOLOGY AND MEDICINE, 2015, 64 : 236 - 245
  • [29] Gene expression data classification using locally linear discriminant embedding
    Li, Bo
    Zheng, Chun-Hou
    Huang, De-Shuang
    Zhang, Lei
    Han, Kyungsook
    COMPUTERS IN BIOLOGY AND MEDICINE, 2010, 40 (10) : 802 - 810
  • [30] An effective classification model for cancer diagnosis using micro array Gene expression data
    Saravanan, V.
    Mallika, R.
    2009 INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND TECHNOLOGY, VOL I, PROCEEDINGS, 2009, : 137 - +