CLUSTERING AND CLASSIFICATION THROUGH NORMALIZING FLOWS IN FEATURE SPACE

Cited by: 19
Authors
Agnelli, J. P. [1 ]
Cadeiras, M. [2 ]
Tabak, E. G. [3 ]
Turner, C. V. [1 ]
Vanden-Eijnden, E. [3 ]
Affiliations
[1] Natl Univ Cordoba, FAMAF, UN Cordoba, RA-5000 Cordoba, Argentina
[2] Univ Alabama Birmingham, Dept Med, Div Cardiol, Birmingham, AL 35294 USA
[3] NYU, Courant Inst, New York, NY 10012 USA
Keywords
maximum likelihood; expectation maximization; Gaussianization; machine learning; density estimation; inference; prediction
DOI
10.1137/100783522
Chinese Library Classification
O1 [Mathematics]
Subject classification codes
0701; 070101
Abstract
A unified variational methodology is developed for classification and clustering problems and is tested in the classification of tumors from gene expression data. It is based on fluid-like flows in feature space that cluster a set of observations by transforming them into likely samples from p isotropic Gaussians, where p is the number of classes sought. The methodology blurs the distinction between training and testing populations through the soft assignment of both to classes. The observations act as Lagrangian markers for the flows, comparatively active or passive depending on the current strength of the assignment to the corresponding class.
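The soft assignment mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' flow-based algorithm: it only shows how observations could be given class probabilities under p isotropic Gaussians, assuming the class centers, a shared standard deviation `sigma`, and the class weights are already known. The function name `soft_assign` and all parameter names are hypothetical, introduced here for illustration.

```python
import numpy as np

def soft_assign(x, means, sigma=1.0, weights=None):
    """Posterior class probabilities under p isotropic Gaussians.

    Assumptions (not from the paper): every class shares the same
    standard deviation `sigma`, and `weights` are the prior class
    probabilities (uniform when omitted).
    """
    x = np.asarray(x, dtype=float)          # shape (n, d): observations
    means = np.asarray(means, dtype=float)  # shape (p, d): class centers
    p = means.shape[0]
    if weights is None:
        weights = np.full(p, 1.0 / p)
    # Squared distance from every observation to every center, shape (n, p).
    sq_dist = ((x[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    # Log of prior times Gaussian density; the shared normalization
    # constant cancels when the rows are normalized below.
    log_post = np.log(weights)[None, :] - 0.5 * sq_dist / sigma**2
    # Subtract the row maximum for numerical stability before exponentiating.
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

# Two classes in two dimensions; the third point is equidistant from both centers.
obs = np.array([[0.0, 0.0], [4.0, 4.0], [2.0, 2.0]])
centers = np.array([[0.0, 0.0], [4.0, 4.0]])
r = soft_assign(obs, centers)
```

Each row of `r` sums to one. Points sitting on a center are assigned almost entirely to that class, while the equidistant point is split evenly, which is the sense in which an assignment is "soft".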
Pages: 1784-1802
Page count: 19