Finding Correlated Biclusters from Gene Expression Data

Cited by: 33
Authors
Yang, Wen-Hui [1 ,2 ]
Dai, Dao-Qing [1 ,2 ]
Yan, Hong [3 ,4 ]
Affiliations
[1] Sun Yat Sen Zhongshan Univ, Ctr Comp Vis, Guangzhou 510275, Guangdong, Peoples R China
[2] Sun Yat Sen Zhongshan Univ, Dept Math, Fac Math & Comp, Guangzhou 510275, Guangdong, Peoples R China
[3] City Univ Hong Kong, Dept Elect Engn, Kowloon, Hong Kong, Peoples R China
[4] Univ Sydney, Sch Elect & Informat Engn, Sydney, NSW 2006, Australia
Keywords
Biclustering; pattern classification; gene expression data; singular-value decomposition; data mining; biology computing; SINGULAR-VALUE DECOMPOSITION; MICROARRAY DATA; DISCRIMINANT-ANALYSIS; CLUSTER-ANALYSIS; PATTERNS; MODELS;
DOI
10.1109/TKDE.2010.150
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Extracting biologically relevant information from DNA microarrays is an important task for drug development and testing, function annotation, and cancer diagnosis. Various clustering methods have been proposed for the analysis of gene expression data, but conventional clustering algorithms often cannot produce a satisfactory solution when applied to large and heterogeneous collections of such data. Biclustering has been proposed as an alternative to standard clustering techniques for identifying local structures in gene expression data sets. These patterns may provide clues about the main biological processes associated with different physiological states. In this paper, we first introduce a pattern that is more general than existing bicluster patterns: the correlated bicluster, which has an intuitive biological interpretation. Then, we propose a novel transform technique based on singular value decomposition, so that the problem of identifying correlated biclusters in a gene expression matrix is transformed into two global clustering problems. The Mixed-Clustering algorithm and the Lift algorithm are devised to efficiently produce delta-corBiclusters. The biclusters obtained using our method from gene expression data sets of multiple human organs and the yeast Saccharomyces cerevisiae demonstrate clear biological meanings.
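To make the idea of a correlated bicluster more concrete, the following is a minimal sketch in Python with NumPy. It is not the paper's Mixed-Clustering or Lift algorithm, and it does not reproduce the paper's SVD-based transform, none of which are specified in this abstract. It assumes, for illustration only, that a correlated bicluster is a submatrix whose rows are strongly linearly correlated over the selected columns, and that a threshold named `delta` (a hypothetical parameter here) bounds the smallest absolute pairwise Pearson correlation. The helper `min_abs_row_correlation` and the synthetic data are likewise assumptions.

```python
import numpy as np


def min_abs_row_correlation(sub):
    """Smallest absolute Pearson correlation between any two rows of `sub`."""
    corr = np.corrcoef(sub)                   # rows are treated as variables
    upper = np.triu_indices_from(corr, k=1)   # each unordered row pair once
    return float(np.min(np.abs(corr[upper])))


rng = np.random.default_rng(0)

# Synthetic background "expression" matrix: 30 genes x 20 conditions.
data = rng.normal(size=(30, 20))

# Plant a block in which every row is a scaled and shifted copy of one
# base profile, so rows are perfectly (positively or negatively) correlated.
rows = np.arange(5)
cols = np.arange(8)
base = rng.normal(size=cols.size)
scales = np.array([1.0, -2.0, 0.5, 3.0, -1.5])
shifts = np.array([0.0, 1.0, -1.0, 2.0, 0.5])
data[np.ix_(rows, cols)] = scales[:, None] * base[None, :] + shifts[:, None]

delta = 0.9  # hypothetical correlation threshold, chosen for illustration

planted = data[np.ix_(rows, cols)]
planted_score = min_abs_row_correlation(planted)

# Five unrelated background rows over the same columns, for comparison.
background = data[np.ix_(np.arange(10, 15), cols)]
background_score = min_abs_row_correlation(background)

# After removing each row's mean, the planted block has rank one, which is
# one reason a singular value decomposition is a natural tool here.
centered = planted - planted.mean(axis=1, keepdims=True)
singular_values = np.linalg.svd(centered, compute_uv=False)

print("planted block passes threshold:", planted_score >= delta)
print("background block score:", round(background_score, 3))
print("second/first singular value:", singular_values[1] / singular_values[0])
```

The planted block covers both positively and negatively correlated rows, which is why the sketch uses the absolute value of the correlation. The rank-one property holds only for the noise-free block constructed here; real expression data would show a dominant but not exclusive leading singular value.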
Pages: 568-584
Number of pages: 17
Related papers
50 items in total
  • [31] Correlating gene and protein expression data using Correlated Factor Analysis
    Chuen Seng Tan
    Agus Salim
    Alexander Ploner
    Janne Lehtiö
    Kee Seng Chia
    Yudi Pawitan
    BMC Bioinformatics, 10
  • [32] Correlating gene and protein expression data using Correlated Factor Analysis
    Tan, Chuen Seng
    Salim, Agus
    Ploner, Alexander
    Lehtio, Janne
    Chia, Kee Seng
    Pawitan, Yudi
    BMC BIOINFORMATICS, 2009, 10 : 272
  • [33] Detection of Significant Biclusters in Gene Expression Data using Reactive Greedy Randomized Adaptive Search Algorithm
    Dharan, Smitha
    Nair, Achuthsankar S.
    13TH INTERNATIONAL CONFERENCE ON BIOMEDICAL ENGINEERING, VOLS 1-3, 2009, 23 (1-3): : 631 - 634
  • [34] Biclustering Analysis on Class Discovery From Gene Expression Data
    Anitha, S.
    Chandran, C. P.
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTING AND COMMUNICATIONS TECHNOLOGIES (ICCCT 15), 2015, : 55 - 60
  • [35] NBF: An FCA-based Algorithm to Identify Negative Correlation Biclusters of DNA Microarray Data
    Houari, Amina
    Ayadi, Wassim
    Ben Yahia, Sadok
    PROCEEDINGS 2018 IEEE 32ND INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS (AINA), 2018, : 1003 - 1010
  • [36] FINDING LARGE AVERAGE SUBMATRICES IN HIGH DIMENSIONAL DATA
    Shabalin, Andrey A.
    Weigman, Victor J.
    Perou, Charles M.
    Nobel, Andrew B.
    ANNALS OF APPLIED STATISTICS, 2009, 3 (03) : 985 - 1012
  • [37] On Biclustering of Gene Expression Data
    Mukhopadhyay, Anirban
    Maulik, Ujjwal
    Bandyopadhyay, Sanghamitra
    CURRENT BIOINFORMATICS, 2010, 5 (03) : 204 - 216
  • [38] Discretization of gene expression data revised
    Gallo, Cristian A.
    Cecchini, Rocio L.
    Carballido, Jessica A.
    Micheletto, Sandra
    Ponzoni, Ignacio
    BRIEFINGS IN BIOINFORMATICS, 2016, 17 (05) : 758 - 770
  • [39] SVD analysis of gene expression data
    Simek, Krzysztof
    Jarzab, Michal
    MATHEMATICAL MODELING OF BIOLOGICAL SYSTEMS, VOL I: CELLULAR BIOPHYSICS, REGULATORY NETWORKS, DEVELOPMENT, BIOMEDICINE, AND DATA ANALYSIS, 2007, : 361 - +
  • [40] MSVD-MOEB Algorithm Applied to Cancer Gene Expression Data
    Wang, Duo
    Zheng, Hongjun
    2015 IEEE 7TH INTERNATIONAL CONFERENCE ON AWARENESS SCIENCE & TECHNOLOGY (ICAST), 2015, : 119 - 124