FABIA: factor analysis for bicluster acquisition

Cited by: 221
Authors
Hochreiter, Sepp [1]
Bodenhofer, Ulrich [1]
Heusel, Martin [1]
Mayr, Andreas [1]
Mitterecker, Andreas [1]
Kasim, Adetayo [2]
Khamiakova, Tatsiana [2]
Van Sanden, Suzy [2]
Lin, Dan [2]
Talloen, Willem [3]
Bijnens, Luc [3]
Gohlmann, Hinrich W. H. [3]
Shkedy, Ziv [2]
Clevert, Djork-Arne [1, 4]
Affiliations
[1] Johannes Kepler Univ Linz, Inst Bioinformat, A-4040 Linz, Austria
[2] Hasselt Univ, Inst Biostat & Stat Bioinformat, Hasselt, Belgium
[3] Johnson & Johnson Pharmaceut Res & Dev, Div Janssen Pharmaceut, Beerse, Belgium
[4] Charite, Dept Nephrol & Internal Intens Care, Berlin, Germany
Keywords
GENE-EXPRESSION DATA; MICROARRAY DATA; TIME-SERIES; ALGORITHM; MODULES; MATRIX
DOI
10.1093/bioinformatics/btq227
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Biclustering of transcriptomic data groups genes and samples simultaneously. It is emerging as a standard tool for extracting knowledge from gene expression measurements. We propose a novel generative approach for biclustering called 'FABIA: Factor Analysis for Bicluster Acquisition'. FABIA is based on a multiplicative model, which accounts for linear dependencies between gene expression and conditions, and also captures heavy-tailed distributions as observed in real-world transcriptomic data. The generative framework makes it possible to use well-founded model selection methods and to apply Bayesian techniques.
Results: On 100 simulated datasets with known true, artificially implanted biclusters, FABIA clearly outperformed all 11 competitors. On these datasets, FABIA was able to separate spurious biclusters from true biclusters by ranking biclusters according to their information content. On three microarray datasets with known subclusters, FABIA was the best method on two and the second best on one among the compared biclustering approaches.
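As a rough illustration of what a multiplicative bicluster model can look like, the sketch below simulates an expression matrix in which each implanted bicluster is the outer product of a sparse gene vector and a sparse sample vector, with nonzero entries drawn from a heavy-tailed Laplace distribution and Gaussian noise added on top. This is an assumption about the model's general form made for illustration only; the abstract does not give the equations, and the function names, parameters, and sizes here are hypothetical. It generates data and does not implement FABIA's inference or its information-content ranking.

```python
import random


def sparse_laplace_vector(n, n_active, rng):
    """Length-n vector that is zero except for n_active Laplace-distributed entries."""
    v = [0.0] * n
    for idx in rng.sample(range(n), n_active):
        # The difference of two Exp(1) draws follows a standard Laplace
        # distribution, which has heavier tails than a Gaussian.
        v[idx] = rng.expovariate(1.0) - rng.expovariate(1.0)
    return v


def simulate_biclusters(n_genes, n_samples, n_biclusters,
                        genes_per, samples_per, noise_sd, seed=0):
    """Return (X, biclusters): a genes-by-samples matrix with implanted biclusters.

    Hypothetical illustration: X = sum_i outer(lam_i, z_i) + Gaussian noise.
    """
    rng = random.Random(seed)
    # Start from additive Gaussian noise.
    X = [[rng.gauss(0.0, noise_sd) for _ in range(n_samples)]
         for _ in range(n_genes)]
    biclusters = []
    for _ in range(n_biclusters):
        lam = sparse_laplace_vector(n_genes, genes_per, rng)    # gene loadings
        z = sparse_laplace_vector(n_samples, samples_per, rng)  # sample factors
        # Add the outer product lam * z^T; only rows and columns where both
        # vectors are nonzero are affected, which is what forms the bicluster.
        for i, li in enumerate(lam):
            if li == 0.0:
                continue
            for j, zj in enumerate(z):
                X[i][j] += li * zj
        biclusters.append((lam, z))
    return X, biclusters


if __name__ == "__main__":
    X, bcs = simulate_biclusters(50, 20, 3, genes_per=8, samples_per=5,
                                 noise_sd=0.1, seed=42)
    for k, (lam, z) in enumerate(bcs):
        genes = [i for i, v in enumerate(lam) if v != 0.0]
        samples = [j for j, v in enumerate(z) if v != 0.0]
        print(f"bicluster {k}: genes={genes} samples={samples}")
```

Within a bicluster built this way, each gene's expression across the member samples is a scaled copy of the same sample profile, which is one way to read the "linear dependencies between gene expression and conditions" mentioned in the abstract.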
Pages: 1520-1527
Number of pages: 8