Ensemble classification for gene expression data based on parallel clustering

Cited by: 1
Authors
Meng, Jun [1]
Jiang, Dingling [1]
Zhang, Jing [1]
Luan, Yushi [2]
Affiliations
[1] Dalian Univ Technol, Sch Comp Sci & Technol, Dalian, Peoples R China
[2] Dalian Univ Technol, Sch Life Sci & Biotechnol, Dalian, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
ensemble classification; microarray data; MapReduce programming model; parallel information fusion; MICROARRAY; SELECTION; STRESS; CANCER; MODEL;
DOI
10.1504/IJDMB.2018.094779
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Analysis of large-scale gene expression data is a research hotspot in bioinformatics and can be used to study abnormal phenomena in plant growth. This paper proposes a biological knowledge integration method based on parallel clustering to select gene subsets effectively. Gene Ontology is used to obtain biological functional similarity, which is then combined with gene expression data. A parallelised affinity propagation algorithm is used to cluster the data, since it not only yields more biologically meaningful subsets but also avoids discarding potentially valuable genes, as a simple preliminary gene selection would. The algorithm is verified on four typical plant datasets and compared with other well-known integration methods. Experimental results on plant stress response datasets demonstrate that the proposed method can select genes with stronger classification ability.
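The clustering step described in the abstract can be illustrated with a minimal serial sketch. It is not the authors' implementation: the paper's version is parallelised with MapReduce, and the abstract does not say how the two similarity sources are fused. The weighted sum with `alpha`, the use of Pearson correlation for expression similarity, the toy profiles, and the Gene Ontology similarity values below are all assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two expression profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def fused_similarity(expr, go_sim, alpha=0.5):
    """Assumed fusion rule: weighted sum of expression and GO similarity."""
    n = len(expr)
    return [[alpha * pearson(expr[i], expr[k]) + (1 - alpha) * go_sim[i][k]
             for k in range(n)] for i in range(n)]

def affinity_propagation(S, preference=None, damping=0.5, iters=200):
    """Plain affinity propagation; returns the exemplar index for each item."""
    n = len(S)
    S = [row[:] for row in S]
    if preference is None:
        # Common default: median of the off-diagonal similarities.
        off = sorted(S[i][k] for i in range(n) for k in range(n) if i != k)
        preference = off[len(off) // 2]
    for i in range(n):
        S[i][i] = preference
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iters):
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=vals.__getitem__)
            first = vals[best]
            second = max(v for k, v in enumerate(vals) if k != best)
            for k in range(n):
                new = S[i][k] - (second if k == best else first)
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # positive responsibilities from i != k
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    score = [R[k][k] + A[k][k] for k in range(n)]
    # Fall back to the single best-scoring item if no exemplar emerges.
    exemplars = [k for k in range(n) if score[k] > 0] or [max(range(n), key=score.__getitem__)]
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]

# Toy data (made up): six genes, three rising and three falling profiles.
expr = [[1.0, 2.0, 3.1], [1.1, 2.0, 2.9], [0.9, 2.1, 3.0],
        [3.0, 2.1, 1.0], [2.9, 2.0, 1.1], [3.1, 1.9, 0.9]]
# Hypothetical GO functional similarity: high within each group of three.
go_sim = [[1.0 if i == k else (0.9 if (i < 3) == (k < 3) else 0.1)
           for k in range(6)] for i in range(6)]

labels = affinity_propagation(fused_similarity(expr, go_sim, alpha=0.5))
print(labels)  # each entry is the index of the exemplar gene for that gene
```

Because affinity propagation chooses the number of clusters itself, the number of gene subsets does not need to be fixed in advance; `preference` and `damping` are the main knobs, and highly symmetric inputs may need a small amount of tie-breaking noise to converge cleanly.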
Pages: 213-229
Number of pages: 17
Related papers
44 in total
  • [1] [Anonymous], QUANTITATIVE BIOL
  • [2] [Anonymous], COMPUTATIONAL COMPLE
  • [3] Calvo-Dmgz D, 2012, ADV INTEL SOFT COMPU, V154, P53
  • [4] Using Variable Precision Rough Set for Selection and Classification of Biological Knowledge Integrated in DNA Gene Expression
    Calvo-Dmgz, D.
    Galvez, J. F.
    Glez-Pena, D.
    Gomez-Meire, S.
    Fdez-Riverola, F.
    [J]. JOURNAL OF INTEGRATIVE BIOINFORMATICS, 2012, 9 (03)
  • [5] Selecting genes by test statistics
    Chen, DC
    Liu, ZQ
    Ma, XB
    Hua, D
    [J]. JOURNAL OF BIOMEDICINE AND BIOTECHNOLOGY, 2005, (02) : 132 - 138
  • [6] Integrating Biological Knowledge with Gene Expression Profiles for Survival Prediction of Cancer
    Chen, Xi
    Wang, Lily
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 2009, 16 (02) : 265 - 278
  • [7] Tabu Search and Binary Particle Swarm Optimization for Feature Selection Using Microarray Data
    Chuang, Li-Yeh
    Yang, Cheng-Huei
    Yang, Cheng-Hong
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 2009, 16 (12) : 1689 - 1703
  • [8] Introducing randomness into greedy ensemble pruning algorithms
    Dai, Qun
    Li, Meiling
    [J]. APPLIED INTELLIGENCE, 2015, 42 (03) : 406 - 429
  • [9] Transfer Prototype-Based Fuzzy Clustering
    Deng, Zhaohong
    Jiang, Yizhang
    Chung, Fu-Lai
    Ishibuchi, Hisao
    Choi, Kup-Sze
    Wang, Shitong
    [J]. IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2016, 24 (05) : 1210 - 1232
  • [10] EEW-SC: Enhanced Entropy-Weighting Subspace Clustering for high dimensional gene expression data clustering analysis
    Deng, Zhaohong
    Choi, Kup-Sze
    Chung, Fu-Lai
    Wang, Shitong
    [J]. APPLIED SOFT COMPUTING, 2011, 11 (08) : 4798 - 4806