A kernel-based clustering method for gene selection with gene expression data

Citations: 49
Authors
Chen, Huihui [1]
Zhang, Yusen [1]
Gutman, Ivan [2]
Affiliations
[1] Shandong Univ Weihai, Sch Math & Stat, Weihai 264209, Peoples R China
[2] Univ Kragujevac, Fac Sci, POB 60, Kragujevac 34000, Serbia
Keywords
Gene expression data; Kernel-based clustering; Adaptive distance; Gene selection; Cancer classification; CANCER CLASSIFICATION; PREDICTION; ALGORITHM; DISCOVERY;
DOI
10.1016/j.jbi.2016.05.007
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Gene selection is important for cancer classification based on gene expression data, because such data combine high dimensionality with small sample sizes. In this paper, we present a new gene selection method based on clustering, in which dissimilarity measures are obtained through kernel functions. The method iteratively searches for the best gene weights while optimizing the clustering objective function. An adaptive distance is used in the process; it is well suited to learning the gene weights during clustering and improves the performance of the algorithm. The proposed algorithm is simple and does not require any modification or parameter optimization for each dataset. We tested it on eight publicly available datasets with two classifiers (support vector machine and k-nearest neighbor) and compared it with six other competitive feature selectors. The results show that the proposed algorithm is capable of achieving better accuracies and may be an efficient tool for finding possible biomarkers from gene expression data. © 2016 Elsevier Inc. All rights reserved.
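The abstract does not give the dissimilarity formula, so the following is only an illustrative sketch of what a kernel-induced, per-gene weighted distance can look like, not the authors' implementation. It assumes a Gaussian kernel applied to each gene separately and uses the standard identity that the squared distance in kernel feature space is K(a, a) - 2K(a, b) + K(b, b), which equals 2(1 - K(a, b)) for a Gaussian kernel. The names `gaussian_kernel`, `weighted_kernel_distance`, `weights`, and `sigma` are hypothetical.

```python
import math


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel between two scalar expression values (assumed kernel)."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))


def weighted_kernel_distance(x, y, weights, sigma=1.0):
    """Sum over genes of weight * kernel-induced squared distance.

    For a Gaussian kernel K(a, a) = 1, so the feature-space squared
    distance for one gene reduces to 2 * (1 - K(a, b)).
    """
    return sum(
        w * 2.0 * (1.0 - gaussian_kernel(a, b, sigma))
        for a, b, w in zip(x, y, weights)
    )


# Toy expression profile and cluster prototype (made-up numbers).
sample = [0.2, 1.5, -0.3]
prototype = [0.2, 0.5, -0.3]

uniform = [1.0, 1.0, 1.0]
d_uniform = weighted_kernel_distance(sample, prototype, uniform)

# Lowering the weight of the second gene shrinks its contribution.
d_downweighted = weighted_kernel_distance(sample, prototype, [1.0, 0.1, 1.0])
```

In a scheme of this kind, genes that end up with larger learned weights contribute more to the clustering dissimilarity, which is one way such weights could be read as a ranking for gene selection. How the paper actually updates and constrains the weights is not stated in the abstract.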
Pages: 12-20
Number of pages: 9
Related papers
50 in total
  • [41] Random Forest for Gene Selection and Microarray Data Classification
    Moorthy, Kohbalan
    Mohamad, Mohd Saberi
    KNOWLEDGE TECHNOLOGY, 2012, 295 : 174 - 183
  • [42] Ensemble gene selection by grouping for microarray data classification
    Liu, Huawen
    Liu, Lei
    Zhang, Huijie
    JOURNAL OF BIOMEDICAL INFORMATICS, 2010, 43 (01) : 81 - 87
  • [43] Assisted gene expression-based clustering with AWNCut
    Li, Yang
    Bie, Ruofan
    Hidalgo, Sebastian J. Teran
    Qin, Yichen
    Wu, Mengyun
    Ma, Shuangge
    STATISTICS IN MEDICINE, 2018, 37 (29) : 4386 - 4403
  • [44] Performance Assessment of Kernel-Based Clustering
    Tushir, Meena
    Srivastava, Smriti
    COMPUTATIONAL INTELLIGENCE, CYBER SECURITY AND COMPUTATIONAL MODELS, 2014, 246 : 139 - 145
  • [45] A hybrid gene selection method based on gene scoring strategy and improved particle swarm optimization
    Han, Fei
    Tang, Di
    Sun, Yu-Wen-Tian
    Cheng, Zhun
    Jiang, Jing
    Li, Qiu-Wei
    BMC BIOINFORMATICS, 2019, 20 (Suppl 8)
  • [46] A New Filter Approach Based on Effective Ranges for Classification of Gene Expression Data
    Turfan, Derya
    Altunkaynak, Bulent
    Yeniay, Ozgur
    BIG DATA, 2024, 12 (04) : 312 - 330
  • [47] Gene expression data clustering based on graph regularized subspace segmentation
    Chen, Xiaoyun
    Jian, Cairen
    NEUROCOMPUTING, 2014, 143 : 44 - 50
  • [48] Kernel-based clustering via Isolation Distributional Kernel
    Zhu, Ye
    Ting, Kai Ming
    INFORMATION SYSTEMS, 2023, 117
  • [49] A Gene Selection Method for Microarray Data Based on Binary PSO Encoding Gene-to-Class Sensitivity Information
    Han, Fei
    Yang, Chun
    Wu, Ya-Qi
    Zhu, Jian-Sheng
    Ling, Qing-Hua
    Song, Yu-Qing
    Huang, De-Shuang
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2017, 14 (01) : 85 - 96
  • [50] An Integrated Feature Selection Algorithm for Cancer Classification using Gene Expression Data
    Ahmed, Saeed
    Kabir, Muhammad
    Ali, Zakir
    Arif, Muhammad
    Ali, Farman
    Yu, Dong-Jun
    COMBINATORIAL CHEMISTRY & HIGH THROUGHPUT SCREENING, 2018, 21 (09) : 631 - 645