Tamed Warping Network for High-Resolution Semantic Video Segmentation

Cited by: 0
Authors
Li, Songyuan [1]
Feng, Junyi [1]
Li, Xi [1]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci, Hangzhou 310027, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 18
Keywords
semantic video segmentation; warping; NET;
DOI
10.3390/app131810102
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
Recent approaches to fast semantic video segmentation reduce redundancy by warping feature maps across adjacent frames, which greatly speeds up inference. However, accuracy drops substantially because of the errors that warping introduces. In this paper, we propose a novel framework with a simple and effective correction stage after warping. Specifically, we build a non-key-frame CNN that fuses warped context features with the current frame's spatial details. Based on this feature fusion, our context feature rectification (CFR) module learns the model's difference from a per-frame model in order to correct the warped features. Furthermore, our residual-guided attention (RGA) module uses the residual maps in the compressed domain to help CFR focus on error-prone regions. Results on Cityscapes show that accuracy increases from 67.3% to 71.6%, while speed drops only slightly, from 65.5 FPS to 61.8 FPS, at a resolution of 1024×2048. For non-rigid categories, e.g., "human" and "object", the improvements exceed 18 percentage points.
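The warp-then-correct idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the nearest-neighbour warp, the max-normalised attention, and the hand-supplied `correction` map (standing in for what a learned CFR module would predict) are all assumptions made for illustration.

```python
def warp_features(prev_feat, motion):
    """Nearest-neighbour backward warp (assumed scheme).

    out[y][x] = prev_feat[y + dy][x + dx], with source indices clamped
    to the map borders. motion[y][x] is an integer (dy, dx) pair.
    """
    h, w = len(prev_feat), len(prev_feat[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = motion[y][x]
            sy = min(max(y + dy, 0), h - 1)
            sx = min(max(x + dx, 0), w - 1)
            out[y][x] = prev_feat[sy][sx]
    return out


def residual_attention(residual):
    """Map |residual| to [0, 1]; a larger residual gets more attention."""
    peak = max(max(abs(v) for v in row) for row in residual)
    if peak == 0:
        return [[0.0 for _ in row] for row in residual]
    return [[abs(v) / peak for v in row] for row in residual]


def correct(warped, correction, attention):
    """Add the correction to the warped features, weighted by attention."""
    return [
        [wv + a * c for wv, c, a in zip(w_row, c_row, a_row)]
        for w_row, c_row, a_row in zip(warped, correction, attention)
    ]


# Toy 2x2 single-channel example (all values hypothetical).
prev_feat = [[1.0, 2.0], [3.0, 4.0]]
motion = [[(0, 1), (0, 1)], [(0, 1), (0, 1)]]   # every pixel reads from its right neighbour
residual = [[0, 2], [0, -4]]                    # stand-in for compressed-domain residuals
correction = [[1.0, 1.0], [1.0, 1.0]]           # stand-in for a learned CFR output

warped = warp_features(prev_feat, motion)
attention = residual_attention(residual)
corrected = correct(warped, correction, attention)
```

In this sketch, pixels with a zero residual keep their warped value unchanged, while the pixel with the largest residual receives the full correction, which mirrors the abstract's description of attention steering the correction toward error-prone regions.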
Pages: 17