Clustering with shallow trees

被引:8
作者
Bailly-Bechet, M. [1 ]
Bradde, S. [2 ,3 ]
Braunstein, A. [4 ]
Flaxman, A. [5 ]
Foini, L. [2 ,3 ]
Zecchina, R. [4 ]
机构
[1] Univ Lyon 1, CNRS, UMR 5558, Lab Biometrie & Biol Evolut, F-69622 Villeurbanne, France
[2] SISSA, I-34014 Trieste, Italy
[3] Ist Nazl Fis Nucl, Sez Trieste, Trieste, Italy
[4] Politecn Torino, Turin, Italy
[5] Univ Washington, IHME, Seattle, WA 98195 USA
关键词
cavity and replica method; message-passing algorithms; SEQUENCES;
D O I
10.1088/1742-5468/2009/12/P12010
中图分类号
O3 [力学];
学科分类号
08 ; 0801 ;
摘要
We propose a new method for obtaining hierarchical clustering based on the optimization of a cost function over trees of limited depth, and we derive a message-passing method that allows one to use it efficiently. The method and the associated algorithm can be interpreted as a natural interpolation between two well-known approaches, namely that of single linkage and the recently presented affinity propagation. We analyse using this general scheme three biological/medical structured data sets (human population based on genetic information, proteins based on sequences and verbal autopsies) and show that the interpolation technique provides new insight.
引用
收藏
页数:17
相关论文
共 20 条
[1]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[2]   Machine learning methods for predictive proteomics [J].
Barla, Annalisa ;
Jurman, Giuseppe ;
Riccadonna, Samantha ;
Merler, Stefano ;
Chierici, Marco ;
Furlanello, Cesare .
BRIEFINGS IN BIOINFORMATICS, 2008, 9 (02) :119-128
[3]   Statistical mechanics of Steiner trees [J].
Bayati, M. ;
Borgs, C. ;
Braunstein, A. ;
Chayes, J. ;
Ramezanpour, A. ;
Zecchina, R. .
PHYSICAL REVIEW LETTERS, 2008, 101 (03)
[4]   A rigorous analysis of the cavity equations for the minimum spanning tree [J].
Bayati, Mohsen ;
Braunstein, Alfredo ;
Zecchina, Riccardo .
JOURNAL OF MATHEMATICAL PHYSICS, 2008, 49 (12)
[5]  
BRADDE S, IN PRESS
[6]   Learning by message passing in networks of discrete synapses [J].
Braunstein, A ;
Zecchina, R .
PHYSICAL REVIEW LETTERS, 2006, 96 (03)
[7]   Inference from clustering with application to gene-expression microarrays [J].
Dougherty, ER ;
Barrera, J ;
Brun, M ;
Kim, S ;
Cesar, RM ;
Chen, YD ;
Bittner, M ;
Trent, JM .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2002, 9 (01) :105-126
[8]   Cluster analysis and display of genome-wide expression patterns [J].
Eisen, MB ;
Spellman, PT ;
Brown, PO ;
Botstein, D .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1998, 95 (25) :14863-14868
[9]   A second generation human haplotype map of over 3.1 million SNPs [J].
Frazer, Kelly A. ;
Ballinger, Dennis G. ;
Cox, David R. ;
Hinds, David A. ;
Stuve, Laura L. ;
Gibbs, Richard A. ;
Belmont, John W. ;
Boudreau, Andrew ;
Hardenbol, Paul ;
Leal, Suzanne M. ;
Pasternak, Shiran ;
Wheeler, David A. ;
Willis, Thomas D. ;
Yu, Fuli ;
Yang, Huanming ;
Zeng, Changqing ;
Gao, Yang ;
Hu, Haoran ;
Hu, Weitao ;
Li, Chaohua ;
Lin, Wei ;
Liu, Siqi ;
Pan, Hao ;
Tang, Xiaoli ;
Wang, Jian ;
Wang, Wei ;
Yu, Jun ;
Zhang, Bo ;
Zhang, Qingrun ;
Zhao, Hongbin ;
Zhao, Hui ;
Zhou, Jun ;
Gabriel, Stacey B. ;
Barry, Rachel ;
Blumenstiel, Brendan ;
Camargo, Amy ;
Defelice, Matthew ;
Faggart, Maura ;
Goyette, Mary ;
Gupta, Supriya ;
Moore, Jamie ;
Nguyen, Huy ;
Onofrio, Robert C. ;
Parkin, Melissa ;
Roy, Jessica ;
Stahl, Erich ;
Winchester, Ellen ;
Ziaugra, Liuda ;
Altshuler, David ;
Shen, Yan .
NATURE, 2007, 449 (7164) :851-U3
[10]   Clustering by passing messages between data points [J].
Frey, Brendan J. ;
Dueck, Delbert .
SCIENCE, 2007, 315 (5814) :972-976