Invariant Scattering Convolution Networks

Cited by: 1063
Authors
Bruna, Joan [1]
Mallat, Stephane [2]
Affiliations
[1] NYU, Courant Inst, New York, NY 10003 USA
[2] Ecole Normale Super, F-75005 Paris, France
Keywords
Classification; convolution networks; deformations; invariants; wavelets; models
DOI
10.1109/TPAMI.2012.230
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A wavelet scattering network computes a translation invariant image representation which is stable to deformations and preserves high-frequency information for classification. It cascades wavelet transform convolutions with nonlinear modulus and averaging operators. The first network layer outputs SIFT-type descriptors, whereas the next layers provide complementary invariant information that improves classification. The mathematical analysis of wavelet scattering networks explains important properties of deep convolution networks for classification. A scattering representation of stationary processes incorporates higher order moments and can thus discriminate textures having the same Fourier power spectrum. State-of-the-art classification results are obtained for handwritten digits and texture discrimination, with a Gaussian kernel SVM and a generative PCA classifier.
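The cascade described in the abstract (wavelet convolution, then modulus, then averaging) can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the paper works with two-dimensional image wavelets, whereas this sketch assumes a 1D signal, Gaussian band-pass filters defined directly in the Fourier domain, circular convolution, and a global average in place of a windowed low-pass filter. The names `bandpass_bank` and `scatter`, the filter parameters, and the restriction of second-order paths to coarser filters are all hypothetical choices made for illustration.

```python
import numpy as np


def bandpass_bank(n, num_scales):
    """Hypothetical filter bank: Gaussian bumps in the Fourier domain.

    Each filter is centered at a positive frequency that halves from one
    scale to the next, loosely imitating a dyadic wavelet family.
    """
    freqs = np.fft.fftfreq(n)  # cycles per sample, in [-0.5, 0.5)
    bank = []
    for j in range(num_scales):
        center = 0.25 / 2 ** j
        width = center / 2
        bank.append(np.exp(-0.5 * ((freqs - center) / width) ** 2))
    return bank


def scatter(x, num_scales=3):
    """Order 0, 1 and 2 scattering-style coefficients of a 1D signal.

    Assumption: the averaging operator is a global mean, which makes the
    output exactly invariant to circular shifts of the input.
    """
    x = np.asarray(x, dtype=float)
    bank = bandpass_bank(len(x), num_scales)

    def wavelet_modulus(signal, psi_hat):
        # Circular convolution via the FFT, followed by the complex modulus.
        return np.abs(np.fft.ifft(np.fft.fft(signal) * psi_hat))

    coeffs = [x.mean()]  # order 0: plain average of the signal
    for j1, psi1 in enumerate(bank):
        u1 = wavelet_modulus(x, psi1)
        coeffs.append(u1.mean())  # order 1: averaged wavelet modulus
        for j2 in range(j1 + 1, num_scales):
            # Order 2: apply a coarser filter to the first-layer envelope.
            u2 = wavelet_modulus(u1, bank[j2])
            coeffs.append(u2.mean())
    return np.array(coeffs)


# Usage: compare the coefficients of a signal and a circularly shifted copy.
rng = np.random.default_rng(0)
signal = rng.standard_normal(128)
shifted = np.roll(signal, 5)
print(np.allclose(scatter(signal), scatter(shifted)))
```

Because every step before the final mean commutes with circular shifts, the averaged coefficients do not change when the input is shifted, which is the translation invariance the abstract refers to. Stability to deformations and the discrimination of textures are properties analyzed in the paper and are not demonstrated by this sketch.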
Pages: 1872-1886
Page count: 15