Identifying Subspace Gene Clusters from Microarray Data Using Low-Rank Representation

Cited by: 13
Authors
Cui, Yan [1]
Zheng, Chun-Hou [2]
Yang, Jian [1]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Technol, Nanjing, Jiangsu, Peoples R China
[2] Anhui Univ, Coll Elect Engn & Automat, Hefei 230039, Anhui, Peoples R China
Source
PLOS ONE | 2013 / Vol. 8 / Issue 03
Funding
National Natural Science Foundation of China
Keywords
TRANSCRIPTIONAL MODULES; EXPRESSION DATA; COEXPRESSION; INFORMATION; DISCOVERY; COREGULATION;
DOI
10.1371/journal.pone.0059377
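Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract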
Identifying subspace gene clusters from gene expression data is useful for discovering novel functional gene interactions. In this paper, we propose to use low-rank representation (LRR) to identify subspace gene clusters from microarray data. LRR seeks the lowest-rank representation among all the candidates that can represent the genes as linear combinations of the bases in the dataset. The clusters can be extracted from the block-diagonal representation matrix obtained using LRR, and they capture the intrinsic patterns of genes with similar functions well. Meanwhile, the parameter of LRR balances the effect of noise, so the method can extract useful information from data with a high level of background noise. Compared with traditional methods, our approach can identify genes that have similar functions but dissimilar expression profiles. It can also assign one gene to several clusters. Moreover, our method is robust to noise and can identify more biologically relevant gene clusters. When applied to three public datasets, the LRR-based method proves superior to existing methods for identifying subspace gene clusters.
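The idea of a lowest-rank self-representation with a block-diagonal structure can be illustrated with a small sketch. It is not the authors' implementation: it covers only the noise-free special case, where the minimizer of the nuclear norm of Z subject to X = XZ has the known closed form Z = V Vᵀ (V from the skinny SVD of X). The noise-weighted version described in the abstract needs an iterative solver and is not shown. The synthetic data, the function names `lrr_noiseless` and `components`, and the threshold values are all assumptions made for illustration.

```python
import numpy as np

def lrr_noiseless(X, tol=1e-8):
    """Noise-free low-rank representation: Z = V V^T from the skinny SVD of X.

    Columns of X are the data points (here: synthetic stand-ins for genes).
    """
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    r = int(np.sum(s > tol * s[0]))   # numerical rank of X
    V = Vt[:r].T                      # right singular vectors, shape (n, r)
    return V @ V.T

def components(A, thresh=1e-8):
    """Label the connected components of the graph with edges A[i, j] > thresh."""
    n = A.shape[0]
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(A[i] > thresh):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(int(j))
        current += 1
    return labels

# Synthetic data (assumption): 10 points drawn from two independent
# 2-dimensional subspaces of a 6-dimensional space, 5 points per subspace.
rng = np.random.default_rng(0)
B1 = rng.standard_normal((6, 2))
B2 = rng.standard_normal((6, 2))
X = np.hstack([B1 @ rng.standard_normal((2, 5)),
               B2 @ rng.standard_normal((2, 5))])

Z = lrr_noiseless(X)            # 10 x 10 representation matrix
A = np.abs(Z) + np.abs(Z.T)     # symmetric affinity between points
labels = components(A)          # one label per point
print(labels)
```

With independent subspaces and no noise, the off-diagonal blocks of `Z` vanish up to rounding error, so reading clusters off the affinity graph is enough here. On real expression data the blocks are only approximate, and a spectral clustering step on `A` would be the more usual choice.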
Pages: 14