Multi-scale Matching Networks for Semantic Correspondence

Cited by: 14
Authors
Zhao, Dongyang [1, 2]
Song, Ziyang [1]
Ji, Zhenghao [1]
Zhao, Gangming [3]
Ge, Weifeng [1, 2]
Yu, Yizhou [3]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Nebula AI Grp, Shanghai, Peoples R China
[2] Shanghai Key Lab Intelligent Informat Proc, Shanghai, Peoples R China
[3] Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
Source
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021
DOI
10.1109/ICCV48922.2021.00334
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep features have proven powerful for building accurate dense semantic correspondences in many previous works. However, the multi-scale, pyramidal hierarchy of convolutional neural networks has not been well studied as a way to learn discriminative pixel-level features for semantic correspondence. In this paper, we propose a multi-scale matching network that is sensitive to tiny semantic differences between neighboring pixels. We follow the coarse-to-fine matching strategy and build a top-down feature and matching enhancement scheme that is coupled with the multi-scale hierarchy of deep convolutional neural networks. During feature enhancement, intra-scale enhancement fuses same-resolution feature maps from multiple layers via local self-attention, and cross-scale enhancement hallucinates higher-resolution feature maps along the top-down pathway. In addition, we learn complementary matching details at different scales, so that the overall matching score is gradually refined by features of different semantic levels. Our multi-scale matching network can be trained end-to-end easily with few additional learnable parameters. Experimental results demonstrate that the proposed method achieves state-of-the-art performance on three popular benchmarks with high computational efficiency. The code has been released at https://github.com/wintersun661/MMNet.
Pages: 3334-3344
Page count: 11