FABIA: factor analysis for bicluster acquisition

Cited by: 221
Authors
Hochreiter, Sepp [1]
Bodenhofer, Ulrich [1]
Heusel, Martin [1]
Mayr, Andreas [1]
Mitterecker, Andreas [1]
Kasim, Adetayo [2]
Khamiakova, Tatsiana [2]
Van Sanden, Suzy [2]
Lin, Dan [2]
Talloen, Willem [3]
Bijnens, Luc [3]
Gohlmann, Hinrich W. H. [3]
Shkedy, Ziv [2]
Clevert, Djork-Arne [1, 4]
Affiliations
[1] Johannes Kepler Univ Linz, Inst Bioinformat, A-4040 Linz, Austria
[2] Hasselt Univ, Inst Biostat & Stat Bioinformat, Hasselt, Belgium
[3] Johnson & Johnson Pharmaceut Res & Dev, Div Janssen Pharmaceut, Beerse, Belgium
[4] Charite, Dept Nephrol & Internal Intens Care, Berlin, Germany
Keywords
GENE-EXPRESSION DATA; MICROARRAY DATA; TIME-SERIES; ALGORITHM; MODULES; MATRIX
DOI
10.1093/bioinformatics/btq227
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Biclustering of transcriptomic data groups genes and samples simultaneously. It is emerging as a standard tool for extracting knowledge from gene expression measurements. We propose a novel generative approach for biclustering called 'FABIA: Factor Analysis for Bicluster Acquisition'. FABIA is based on a multiplicative model, which accounts for linear dependencies between gene expression and conditions and also captures the heavy-tailed distributions observed in real-world transcriptomic data. The generative framework makes it possible to use well-founded model selection methods and to apply Bayesian techniques.
Results: On 100 simulated datasets with known, artificially implanted biclusters, FABIA clearly outperformed all 11 competitors. On these datasets, FABIA was able to separate spurious biclusters from true biclusters by ranking biclusters according to their information content. FABIA was also tested on three microarray datasets with known subclusters, where it was the best method on two datasets and the second best on the third among the compared biclustering approaches.
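The multiplicative model mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation. It only assumes that each bicluster is the outer product of a sparse gene vector and a sparse sample vector, that the nonzero entries are drawn from a Laplace distribution to mimic heavy tails, and that Gaussian noise is added. The function name, parameter names, and default sizes are hypothetical and chosen for illustration.

```python
import numpy as np

def simulate_multiplicative_biclusters(n_genes=200, n_samples=60, n_biclusters=3,
                                       genes_per_bicluster=20,
                                       samples_per_bicluster=10,
                                       noise_sd=0.5, seed=0):
    """Draw X = sum_i outer(loading_i, factor_i) + noise (illustrative only)."""
    rng = np.random.default_rng(seed)
    loadings = np.zeros((n_genes, n_biclusters))   # sparse gene vectors (columns)
    factors = np.zeros((n_biclusters, n_samples))  # sparse sample vectors (rows)
    for i in range(n_biclusters):
        # Pick which genes and samples belong to bicluster i.
        genes = rng.choice(n_genes, size=genes_per_bicluster, replace=False)
        samples = rng.choice(n_samples, size=samples_per_bicluster, replace=False)
        # Laplace draws give heavy-tailed nonzero entries (an assumption here).
        loadings[genes, i] = rng.laplace(0.0, 1.0, size=genes_per_bicluster)
        factors[i, samples] = rng.laplace(0.0, 1.0, size=samples_per_bicluster)
    signal = loadings @ factors  # sum of outer products, one per bicluster
    noise = rng.normal(0.0, noise_sd, size=(n_genes, n_samples))
    return signal + noise, loadings, factors

X, loadings, factors = simulate_multiplicative_biclusters()
print(X.shape)
```

Within a bicluster, the expression profiles of any two member genes are proportional to each other across the member samples, which is the linear dependency the abstract refers to.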
Pages: 1520-1527
Page count: 8