Spike and slab biclustering

Cited by: 7

Authors:

Denitto, M. [1]

Bicego, M. [1]

Farinelli, A. [1]

Figueiredo, M. A. T. [2, 3]

Affiliations:

[1] Univ Verona, Str Le Grazie 15,Ca Vignal 2, Verona, Italy

[2] Univ Lisbon, Inst Telecomunicacoes, Ave Rovisco Pais 1, Lisbon, Portugal

[3] Univ Lisbon, Inst Super Tecn, Ave Rovisco Pais 1, Lisbon, Portugal

Source:

Pattern Recognition | 2017 / Vol. 72

Keywords:

Biclustering; Spike and slab; Probabilistic graphical models; Expectation-maximization; GENE-EXPRESSION DATA; NONNEGATIVE MATRIX FACTORIZATION; EM ALGORITHM; VARIABLE SELECTION; MICROARRAY DATA; SPARSE; DECOMPOSITION; MODELS;

DOI:

10.1016/j.patcog.2017.07.021

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Biclustering refers to the problem of simultaneously clustering the rows and columns of a given data matrix, with the goal of obtaining submatrices where the selected rows present a coherent behaviour in the selected columns, and vice versa. To face this intrinsically difficult problem, we propose a novel generative model, where biclustering is approached from a sparse low-rank matrix factorization perspective. The main idea is to design a probabilistic model describing the factorization of a given data matrix into two other matrices, from which information about the rows and columns belonging to the sought-for biclusters can be obtained. One crucial ingredient in the proposed model is the use of a spike and slab sparsity-inducing prior; thus we term the approach spike and slab biclustering (SSBi). To estimate the parameters of the SSBi model, we propose an expectation-maximization (EM) algorithm, termed SSBiEM, which solves a low-rank factorization problem at each iteration, using a recently proposed augmented Lagrangian algorithm. Experiments with both synthetic and real data show that the SSBi approach compares favorably with the state-of-the-art. © 2017 Elsevier Ltd. All rights reserved.
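
The factorization view described in the abstract can be illustrated with a minimal sketch. It is not the SSBi model or the SSBiEM algorithm: it only samples two sparse factors from a simple spike and slab prior (each entry is exactly zero with some probability, otherwise Gaussian), multiplies them into a data matrix, and reads one bicluster per factor from the nonzero supports. The names V, Z, X, p_slab, slab_std and both helper functions are assumptions made for this illustration and do not come from the paper.

```python
import numpy as np


def sample_spike_slab(shape, p_slab, slab_std, rng):
    """Illustrative prior: an entry is exactly 0 (spike) with probability
    1 - p_slab, otherwise drawn from a zero-mean Gaussian (slab)."""
    mask = rng.random(shape) < p_slab
    return mask * rng.normal(0.0, slab_std, size=shape)


def biclusters_from_factors(V, Z, tol=1e-8):
    """Read bicluster k as the rows where column k of V is nonzero
    and the columns where row k of Z is nonzero."""
    found = []
    for k in range(V.shape[1]):
        rows = np.flatnonzero(np.abs(V[:, k]) > tol)
        cols = np.flatnonzero(np.abs(Z[k, :]) > tol)
        found.append((rows, cols))
    return found


rng = np.random.default_rng(0)
n_rows, n_cols, n_biclusters = 30, 20, 3  # hypothetical sizes

V = sample_spike_slab((n_rows, n_biclusters), 0.2, 1.0, rng)  # row factor
Z = sample_spike_slab((n_biclusters, n_cols), 0.2, 1.0, rng)  # column factor

# Observed data: sparse low-rank product plus small Gaussian noise.
X = V @ Z + 0.01 * rng.normal(size=(n_rows, n_cols))

biclusters = biclusters_from_factors(V, Z)
for k, (rows, cols) in enumerate(biclusters):
    print(f"bicluster {k}: {rows.size} rows x {cols.size} columns")
```

In this sketch the factors are known because they were sampled; in the setting of the abstract they are unknown and have to be estimated from the data matrix alone, which is the role the abstract assigns to the EM algorithm.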

Pages: 186-195

Number of pages: 10
