Transformation-invariant clustering using the EM algorithm

Cited by: 69

Authors:

Frey, BJ

Jojic, N

Affiliations:

[1] Univ Toronto, Dept Elect & Comp Engn, Toronto, ON M5S 3G4, Canada

[2] Microsoft Corp, Res, Vis Technol Grp, Redmond, WA 98052 USA

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2003 / Vol. 25 / No. 1

Funding:

U.S. National Science Foundation; Natural Sciences and Engineering Research Council of Canada;

Keywords:

generative models; transformation; transformation-invariance; clustering; video summary; filtering; EM algorithm; probability model;

DOI:

10.1109/TPAMI.2003.1159942

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Clustering is a simple, effective way to derive useful representations of data, such as images and videos. Clustering explains the input as one of several prototypes, plus noise. In situations where each input has been randomly transformed (e.g., by translation, rotation, and shearing in images and videos), clustering techniques tend to extract cluster centers that account for variations in the input due to transformations, instead of more interesting and potentially useful structure. For example, if images from a video sequence of a person walking across a cluttered background are clustered, it would be more useful for the different clusters to represent different poses and expressions, instead of different positions of the person and different configurations of the background clutter. We describe a way to add transformation invariance to mixture models by approximating the nonlinear transformation manifold with a discrete set of points. We show how the expectation-maximization (EM) algorithm can be used to learn the clusters while simultaneously inferring the transformation associated with each input. We compare this technique with other methods on three tasks: filtering noisy images obtained from a scanning electron microscope, clustering images from videos of faces into different categories of identification and pose, and removing foreground obstructions from video. We also demonstrate that the new technique is quite insensitive to initial conditions and works better than standard techniques, even when the standard techniques are provided with extra data.
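The idea in the abstract, a mixture model whose cluster centers are compared against each input under a discrete set of candidate transformations, can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes 1-D signals, uses cyclic shifts (`np.roll`) as the discrete transformation set, a uniform prior over shifts, and a single isotropic noise variance shared by all clusters. The function name `tic_em` and its parameters are hypothetical.

```python
import numpy as np


def _sq_dists(X, mu, shifts):
    """Squared distance from every input to every shifted cluster center.

    Returns an array of shape (N, C, L): input n, cluster c, shift l.
    """
    shifted = np.array([[np.roll(m, l) for l in shifts] for m in mu])  # (C, L, D)
    return ((X[:, None, None, :] - shifted[None]) ** 2).sum(-1)


def tic_em(X, n_clusters, n_iters=30, seed=0):
    """Toy EM for a shift-invariant mixture of isotropic Gaussians.

    Assumed model (an illustration only): x_n = shift_l(mu_c) + noise,
    with a uniform prior over the D cyclic shifts.
    """
    rng = np.random.default_rng(seed)
    N, D = X.shape
    shifts = np.arange(D)
    # Initialise centers from randomly chosen inputs.
    mu = X[rng.choice(N, n_clusters, replace=False)].copy()
    pi = np.full(n_clusters, 1.0 / n_clusters)
    sigma2 = X.var()
    # aligned[n, l] is x_n shifted back by l, used to undo each candidate shift.
    aligned = np.stack([np.roll(X, -l, axis=1) for l in shifts], axis=1)  # (N, L, D)

    for _ in range(n_iters):
        # E-step: joint posterior over (cluster, shift) for each input.
        d2 = _sq_dists(X, mu, shifts)
        log_r = np.log(pi + 1e-12)[None, :, None] - d2 / (2.0 * sigma2)
        log_r -= log_r.max(axis=(1, 2), keepdims=True)  # numerical stability
        r = np.exp(log_r)
        r /= r.sum(axis=(1, 2), keepdims=True)

        # M-step: average the un-shifted inputs, weighted by the posterior.
        weight = r.sum(axis=(0, 2))  # (C,)
        mu = np.einsum("ncl,nld->cd", r, aligned) / (weight[:, None] + 1e-12)
        pi = weight / N
        # Noise variance from the residuals under the updated centers.
        sigma2 = max((r * _sq_dists(X, mu, shifts)).sum() / (N * D), 1e-6)

    return mu, pi, sigma2, r
```

Because cyclic shifts are permutations, undoing a shift and averaging gives the exact center update for this simplified model. Summing `r` over the shift axis gives each input's cluster responsibilities, and taking the argmax over the shift axis gives the inferred transformation for that input.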


Pages: 1-17

Page count: 17
