Transformation-invariant clustering using the EM algorithm

Cited by: 69
Authors
Frey, BJ
Jojic, N
Affiliations
[1] Univ Toronto, Dept Elect & Comp Engn, Toronto, ON M5S 3G4, Canada
[2] Microsoft Corp, Res, Vis Technol Grp, Redmond, WA 98052 USA
Funding
Natural Sciences and Engineering Research Council of Canada; US National Science Foundation
Keywords
generative models; transformation; transformation-invariance; clustering; video summary; filtering; EM algorithm; probability model;
DOI
10.1109/TPAMI.2003.1159942
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering is a simple, effective way to derive useful representations of data, such as images and videos. Clustering explains the input as one of several prototypes, plus noise. In situations where each input has been randomly transformed (e.g., by translation, rotation, and shearing in images and videos), clustering techniques tend to extract cluster centers that account for variations in the input due to transformations, instead of more interesting and potentially useful structure. For example, if images from a video sequence of a person walking across a cluttered background are clustered, it would be more useful for the different clusters to represent different poses and expressions, instead of different positions of the person and different configurations of the background clutter. We describe a way to add transformation invariance to mixture models by approximating the nonlinear transformation manifold with a discrete set of points. We show how the expectation maximization algorithm can be used to jointly learn clusters while inferring the transformation associated with each input. We compare this technique with other methods for filtering noisy images obtained from a scanning electron microscope, clustering images from videos of faces into different categories of identification and pose, and removing foreground obstructions from video. We also demonstrate that the new technique is quite insensitive to initial conditions and works better than standard techniques, even when the standard techniques are provided with extra data.
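The idea in the abstract (a mixture model whose transformation manifold is replaced by a discrete set of transformations, fitted with EM) can be illustrated with a small sketch. This is not the authors' implementation. It assumes 1-D signals, cyclic shifts as the only transformations, a uniform prior over shifts, and a fixed isotropic noise variance `sigma2`. The names `shift`, `tic_em`, and the farthest-point initialisation are hypothetical choices made here so that the example is deterministic.

```python
import math

def shift(v, t):
    """Cyclically shift the list v to the right by t positions."""
    t %= len(v)
    return v[-t:] + v[:-t] if t else list(v)

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length lists."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def invariant_dist(x, mean):
    """Smallest squared distance between x and any shifted copy of mean."""
    return min(sq_dist(x, shift(mean, t)) for t in range(len(x)))

def tic_em(data, n_clusters, n_iters=20, sigma2=0.05):
    """EM for a mixture over (cluster, shift) pairs with isotropic noise."""
    dim = len(data[0])
    # Farthest-point initialisation: an assumption of this sketch.
    means = [list(data[0])]
    while len(means) < n_clusters:
        far = max(data, key=lambda x: min(invariant_dist(x, m) for m in means))
        means.append(list(far))
    weights = [1.0 / n_clusters] * n_clusters
    labels = []
    for _ in range(n_iters):
        num = [[0.0] * dim for _ in range(n_clusters)]
        den = [0.0] * n_clusters
        labels = []
        for x in data:
            # E-step: log p(c) + log p(x | c, t) up to a constant; p(t) is uniform.
            logp = {(c, t): math.log(weights[c])
                    - sq_dist(x, shift(means[c], t)) / (2 * sigma2)
                    for c in range(n_clusters) for t in range(dim)}
            top = max(logp.values())
            norm = sum(math.exp(v - top) for v in logp.values())
            labels.append(max(logp, key=logp.get))  # most probable (cluster, shift)
            for (c, t), v in logp.items():
                r = math.exp(v - top) / norm  # responsibility of (c, t) for x
                # Undo the shift so x contributes in the prototype's own frame.
                for i, a in enumerate(shift(x, -t)):
                    num[c][i] += r * a
                den[c] += r
        # M-step: responsibility-weighted averages of the aligned inputs.
        for c in range(n_clusters):
            if den[c] > 0:
                means[c] = [s / den[c] for s in num[c]]
            weights[c] = max(den[c] / len(data), 1e-12)
    return means, weights, labels

# Two prototypes, each observed under several cyclic shifts.
proto_a = [1, 1, 0, 0, 0, 0, 0, 0]
proto_b = [1, 0, 0, 1, 0, 0, 0, 0]
data = ([shift(proto_a, t) for t in (0, 2, 5, 7)]
        + [shift(proto_b, t) for t in (1, 3, 4, 6)])
means, weights, labels = tic_em(data, n_clusters=2)
```

Because cyclic shifts are permutations and the noise is taken to be isotropic, the mean update reduces to averaging the un-shifted inputs. With general transformations or non-isotropic noise the update would be more involved, and this sketch does not cover that case.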
Pages: 1-17
Number of pages: 17
Related papers
27 in total
  • [1] Shape quantization and recognition with randomized trees
    Amit, Y
    Geman, D
    [J]. NEURAL COMPUTATION, 1997, 9 (07) : 1545 - 1588
  • [2] [Anonymous], GRAPHICAL MODELS MAC
  • [3] [Anonymous], P EUR C COMP VIS
  • [4] GTM: The generative topographic mapping
    Bishop, CM
    Svensen, M
    Williams, CKI
    [J]. NEURAL COMPUTATION, 1998, 10 (01) : 215 - 234
  • [5] Robustly estimating changes in image appearance
    Black, MJ
    Fleet, DJ
    Yacoob, Y
    [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2000, 78 (01) : 8 - 31
  • [6] COMPETITION AND MULTIPLE CAUSE MODELS
    DAYAN, P
    ZEMEL, RS
    [J]. NEURAL COMPUTATION, 1995, 7 (03) : 565 - 579
  • [7] MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM
    DEMPSTER, AP
    LAIRD, NM
    RUBIN, DB
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) : 1 - 38
  • [8] Frey B. J., 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149), P416, DOI 10.1109/CVPR.1999.786972
  • [9] Variational learning in nonlinear Gaussian belief networks
    Frey, BJ
    Hinton, GE
    [J]. NEURAL COMPUTATION, 1999, 11 (01) : 193 - 213
  • [10] FREY BJ, 1999, P IEEE INT C COMP VI