FaPN: Feature-aligned Pyramid Network for Dense Image Prediction

Cited by: 154
Authors
Huang, Shihua [1]
Lu, Zhichao [1]
Cheng, Ran [1]
He, Cheng [1]
Affiliations
[1] Southern Univ Sci & Technol, Dept Comp Sci & Engn, Shenzhen, Peoples R China
Source
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021
Funding
National Natural Science Foundation of China
DOI
10.1109/ICCV48922.2021.00090
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent advances in deep neural networks have led to remarkable progress in dense image prediction. However, most existing approaches neglect feature alignment for the sake of simplicity. Direct pixel-wise addition of upsampled and local features produces feature maps with misaligned contexts, which in turn lead to misclassifications in prediction, especially at object boundaries. In this paper, we propose a feature alignment module that learns transformation offsets of pixels to contextually align upsampled higher-level features, and a feature selection module that emphasizes the lower-level features with rich spatial details. We then integrate these two modules into a top-down pyramidal architecture and present the Feature-aligned Pyramid Network (FaPN). Extensive experimental evaluations on four dense prediction tasks and four datasets demonstrate the efficacy of FaPN, yielding an overall improvement of 1.2 to 2.6 points in AP / mIoU over FPN when paired with Faster / Mask R-CNN. In particular, FaPN achieves a state-of-the-art 56.7% mIoU on ADE20K when integrated within MaskFormer. The code is available from https://github.com/EMIGroup/FaPN.
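The fusion step described in the abstract can be illustrated with a toy 1-D sketch. This is not the authors' implementation (see the repository linked above for that). It assumes hand-supplied `offsets` and `gate_weights` in place of learned parameters, a sigmoid channel gate as one plausible form of feature selection, and linear resampling as a stand-in for the learned alignment. All function names are hypothetical.

```python
import math


def upsample_nearest(channel, factor=2):
    """Repeat every sample `factor` times (nearest-neighbour upsampling)."""
    return [v for v in channel for _ in range(factor)]


def sample_linear(channel, pos):
    """Read `channel` at a fractional position, clamping at the borders."""
    pos = min(max(pos, 0.0), len(channel) - 1.0)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(channel) - 1)
    frac = pos - lo
    return (1.0 - frac) * channel[lo] + frac * channel[hi]


def align(channel, offsets):
    """Resample position i at i + offsets[i].

    Assumption: in a trained network the offsets would be predicted from
    the features; here they are passed in by hand.
    """
    return [sample_linear(channel, i + off) for i, off in enumerate(offsets)]


def select(channels, gate_weights):
    """Rescale each channel by 1 + sigmoid(w * mean(channel)).

    Assumption: a simple channel gate standing in for feature selection.
    """
    out = []
    for ch, w in zip(channels, gate_weights):
        gate = 1.0 / (1.0 + math.exp(-w * sum(ch) / len(ch)))
        out.append([v * (1.0 + gate) for v in ch])
    return out


def fuse(fine, coarse, offsets, gate_weights):
    """Add gated fine features to upsampled, offset-aligned coarse features."""
    selected = select(fine, gate_weights)
    aligned = [align(upsample_nearest(c), offsets) for c in coarse]
    return [[a + b for a, b in zip(s, u)] for s, u in zip(selected, aligned)]


if __name__ == "__main__":
    fine = [[2.0, 4.0, 2.0, 4.0]]   # one channel, high resolution
    coarse = [[0.0, 4.0]]           # one channel, half resolution
    print(fuse(fine, coarse, offsets=[0.0] * 4, gate_weights=[0.0]))
```

With all offsets set to zero, `fuse` reduces to plain addition of upsampled and fine features, which is the baseline the abstract identifies as causing misaligned contexts. Non-zero offsets shift where each upsampled value is read from before the addition.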
Pages: 844-853
Page count: 10