Ensemble classification for gene expression data based on parallel clustering

Cited by: 1
Authors
Meng, Jun [1]
Jiang, Dingling [1]
Zhang, Jing [1]
Luan, Yushi [2]
Affiliations
[1] Dalian Univ Technol, Sch Comp Sci & Technol, Dalian, Peoples R China
[2] Dalian Univ Technol, Sch Life Sci & Biotechnol, Dalian, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
ensemble classification; microarray data; MapReduce programming model; parallel information fusion; MICROARRAY; SELECTION; STRESS; CANCER; MODEL;
DOI
10.1504/IJDMB.2018.094779
Chinese Library Classification number
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
Analysis of large-scale gene expression data is a research hotspot in bioinformatics and can be used to study abnormal phenomena in the plant growth process. This paper proposes a biological knowledge integration method based on parallel clustering to select gene subsets effectively. Gene Ontology is utilised to obtain biological functional similarity, which is combined with gene expression data. A parallelised affinity propagation algorithm is used to cluster the data, since it not only obtains more biologically meaningful subsets but also avoids the loss of potentially valuable genes that simple preliminary gene selection can cause. The algorithm is verified on four typical plant datasets and compared with other well-known integration methods. Experimental results on plant stress response datasets demonstrate that the proposed method can select genes with stronger classification ability.
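The clustering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it runs serially rather than under MapReduce, and because the abstract does not say how Gene Ontology similarity and expression similarity are combined, the weighted sum below is an assumption. The names `combine_similarity`, `affinity_propagation`, `alpha`, and `preference` are hypothetical and chosen only for illustration. The message updates follow the standard affinity propagation rules (responsibilities and availabilities with damping).

```python
def combine_similarity(expr_sim, go_sim, alpha=0.5):
    """Blend two n x n similarity matrices.

    ASSUMPTION: a weighted sum with a hypothetical weight `alpha`;
    the abstract does not specify the combination rule.
    """
    n = len(expr_sim)
    return [[alpha * expr_sim[i][j] + (1.0 - alpha) * go_sim[i][j]
             for j in range(n)] for i in range(n)]


def affinity_propagation(sim, preference, damping=0.5, iterations=200):
    """Serial affinity propagation on a precomputed similarity matrix.

    Returns (exemplars, labels). labels[i] is the exemplar index chosen
    for item i, or -1 for every item if no exemplar emerged.
    """
    n = len(sim)
    if n < 2:
        raise ValueError("need at least two items to cluster")
    s = [row[:] for row in sim]
    for k in range(n):
        s[k][k] = preference  # self-similarity controls the cluster count
    r = [[0.0] * n for _ in range(n)]  # responsibilities
    a = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        for i in range(n):
            scores = [a[i][k] + s[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(scores[j] for j in range(n) if j != k)
                new_r = s[i][k] - best_other
                r[i][k] = damping * r[i][k] + (1.0 - damping) * new_r
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k})
        # a(k,k) = sum of positive r(i',k), i' != k
        for k in range(n):
            pos = [max(0.0, r[i][k]) for i in range(n)]
            others = sum(pos) - pos[k]
            for i in range(n):
                if i == k:
                    new_a = others
                else:
                    new_a = min(0.0, r[k][k] + others - pos[i])
                a[i][k] = damping * a[i][k] + (1.0 - damping) * new_a
    exemplars = [k for k in range(n) if r[k][k] + a[k][k] > 0]
    labels = []
    for i in range(n):
        if not exemplars:
            labels.append(-1)
        elif i in exemplars:
            labels.append(i)
        else:
            # assign to the most similar exemplar
            labels.append(max(exemplars, key=lambda k: s[i][k]))
    return exemplars, labels


if __name__ == "__main__":
    # Toy 4-gene example with made-up similarity values.
    expr = [[1.0, 0.9, 0.1, 0.2],
            [0.9, 1.0, 0.2, 0.1],
            [0.1, 0.2, 1.0, 0.8],
            [0.2, 0.1, 0.8, 1.0]]
    go = [[1.0, 0.7, 0.2, 0.1],
          [0.7, 1.0, 0.1, 0.3],
          [0.2, 0.1, 1.0, 0.9],
          [0.1, 0.3, 0.9, 1.0]]
    combined = combine_similarity(expr, go, alpha=0.5)
    print(affinity_propagation(combined, preference=0.3))
```

In a gene selection setting, the exemplars (or a few genes per cluster) would then be passed to a classifier. A lower `preference` generally yields fewer clusters, and a higher one yields more.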
Pages: 213-229
Page count: 17