The "Gene Cube": A Novel Approach to Three-dimensional Clustering of Gene Expression Data

Cited by: 9
Authors
Lambrou, George I. [1, 2]
Sdraka, Maria [2]
Koutsouris, Dimitrios [2]
Affiliations
[1] Natl & Kapodistrian Univ Athens, Dept Pediat 1, Choremeio Res Lab, Thivon & Levadeias 8, Athens 11527, Greece
[2] Natl & Kapodistrian Univ Athens, Sch Elect & Comp Engn, Biomed Engn Lab, Heroon Polytecniou 9, Athens 15780, Greece
Keywords
Machine learning; clustering; chromosomes; gene expression; DNA microarrays; algorithms; chromosomal domains; microarray; algorithm; array
DOI
10.2174/1574893614666190116170406
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: A very popular technique for isolating significant genes from cancerous tissues is the application of various clustering algorithms to data obtained from DNA microarray experiments. Aim: The objective of the present work is to take the chromosomal identity of every gene into consideration before clustering, by creating a three-dimensional structure of the form Chromosomes × Genes × Samples. The k-means algorithm and a triclustering technique called delta-TRIMAX are then applied independently to the structure. Materials and Methods: The present algorithm was developed using the Python programming language (v. 3.5.1). For this work, we used two distinct public datasets containing healthy control samples and tissue samples from bladder cancer patients. Background correction was performed by subtracting the median global background from the median local background from the signal intensity. The quantile normalization method was applied for sample normalization. Three known algorithms were applied to test the "gene cube": a classical k-means, a transformed 3D k-means, and delta-TRIMAX. Results: Our proposed data structure consists of a 3D matrix of the form Chromosomes × Genes × Samples. Clustering analysis of that structure gave very good results, as we were able to identify gene expression patterns among samples, genes and chromosomes. Discussion: To the best of our knowledge, this is the first time that such a structure has been reported, and it constitutes a useful tool for gene classification from high-throughput gene expression experiments. Conclusions: Such approaches could prove useful for understanding disease mechanisms, and tumors in particular.
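The abstract does not give implementation details, so the following is only a minimal sketch of two of the steps it names: quantile normalization across samples and the arrangement of a genes × samples matrix into a Chromosomes × Genes × Samples array. The function names (`quantile_normalize`, `build_gene_cube`), the NaN padding for chromosomes with fewer genes, and the toy data are all assumptions made for illustration and are not taken from the paper. The clustering steps (k-means, delta-TRIMAX) are not shown.

```python
import numpy as np


def quantile_normalize(expr):
    """Quantile-normalize the columns (samples) of a genes x samples matrix.

    Simplified variant: ties within a sample are not averaged.
    """
    expr = np.asarray(expr, dtype=float)
    order = np.argsort(expr, axis=0)        # per-sample sort order
    ranks = np.argsort(order, axis=0)       # rank of each gene within its sample
    reference = np.sort(expr, axis=0).mean(axis=1)  # mean of the k-th smallest values
    return reference[ranks]                 # replace each value by the reference at its rank


def build_gene_cube(expr, chrom_labels):
    """Arrange a genes x samples matrix into a chromosomes x genes x samples array.

    Chromosomes carry different numbers of genes, so shorter slices are
    padded with NaN (an assumption of this sketch, not the paper's choice).
    """
    expr = np.asarray(expr, dtype=float)
    chrom_labels = np.asarray(chrom_labels)
    chroms = sorted(set(chrom_labels.tolist()))
    max_genes = max(int(np.sum(chrom_labels == c)) for c in chroms)
    cube = np.full((len(chroms), max_genes, expr.shape[1]), np.nan)
    for i, chrom in enumerate(chroms):
        rows = expr[chrom_labels == chrom]  # all genes located on this chromosome
        cube[i, : rows.shape[0], :] = rows
    return chroms, cube


# Toy data (hypothetical): 5 genes, 3 samples, genes spread over 3 chromosomes.
rng = np.random.default_rng(0)
expr = rng.normal(loc=8.0, scale=2.0, size=(5, 3))
labels = ["1", "1", "2", "X", "2"]

norm = quantile_normalize(expr)
chroms, cube = build_gene_cube(norm, labels)
print(chroms, cube.shape)
```

After normalization every sample shares the same set of values, differing only in which gene holds which value. An array laid out this way can be sliced per chromosome (`cube[i]`), per sample (`cube[:, :, k]`), or flattened for a conventional clustering routine.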
Pages: 721-727
Number of pages: 7