Improving Semantic Segmentation via Video Propagation and Label Relaxation

Cited by: 220

Authors:

Zhu, Yi [1]

Sapra, Karan [2]

Reda, Fitsum A. [2]

Shih, Kevin J. [2]

Newsam, Shawn [1]

Tao, Andrew [2]

Catanzaro, Bryan [2]

Affiliations:

[1] Univ Calif Merced, Merced, CA 95343 USA

[2] Nvidia Corp, Santa Clara, CA USA

Source:

2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019

Keywords:

DOI:

10.1109/CVPR.2019.00906

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Semantic segmentation requires large amounts of pixel-wise annotations to learn accurate models. In this paper, we present a video prediction-based methodology to scale up training sets by synthesizing new training samples in order to improve the accuracy of semantic segmentation networks. We exploit video prediction models' ability to predict future frames in order to also predict future labels. A joint propagation strategy is also proposed to alleviate mis-alignments in synthesized samples. We demonstrate that training segmentation models on datasets augmented by the synthesized samples leads to significant improvements in accuracy. Furthermore, we introduce a novel boundary label relaxation technique that makes training robust to annotation noise and propagation artifacts along object boundaries. Our proposed methods achieve state-of-the-art mIoUs of 83.5% on Cityscapes and 82.9% on CamVid. Our single model, without model ensembles, achieves 72.8% mIoU on the KITTI semantic segmentation test set, which surpasses the winning entry of the ROB challenge 2018.
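
The abstract does not spell out how boundary label relaxation is computed, so the following is only a minimal sketch of one plausible reading: at each pixel, the loss rewards probability mass placed on any class that appears in a small window of the label map around that pixel, so boundary pixels may be predicted as either neighboring class. The function name `relaxed_boundary_loss`, the 3x3 window, and the edge-replication padding are illustrative assumptions, not details taken from this record.

```python
import numpy as np


def relaxed_boundary_loss(probs, labels, window=3):
    """Hypothetical relaxed loss (illustrative assumption, not the authors' code).

    probs:  (H, W, C) array of per-pixel class probabilities (rows sum to 1).
    labels: (H, W) array of integer class ids in [0, C).
    window: odd side length of the neighborhood used to relax the labels.
    """
    H, W, C = probs.shape
    r = window // 2

    # One-hot encode the labels as a boolean (H, W, C) mask.
    onehot = np.eye(C, dtype=bool)[labels]

    # Replicate border labels so every pixel has a full neighborhood.
    padded = np.pad(onehot, ((r, r), (r, r), (0, 0)), mode="edge")

    # allowed[y, x, c] is True if class c occurs anywhere in the window
    # centered at (y, x) in the label map.
    allowed = np.zeros_like(onehot)
    for dy in range(window):
        for dx in range(window):
            allowed |= padded[dy:dy + H, dx:dx + W]

    # Negative log of the total probability assigned to allowed classes.
    p_allowed = (probs * allowed).sum(axis=-1)
    return float(-np.log(np.clip(p_allowed, 1e-12, 1.0)).mean())


if __name__ == "__main__":
    # Two classes split by a vertical boundary; predictions are uniform.
    probs = np.full((4, 4, 2), 0.5)
    labels = np.zeros((4, 4), dtype=int)
    labels[:, 2:] = 1
    print(relaxed_boundary_loss(probs, labels))
```

Away from boundaries the window contains a single class, so this sketch reduces to ordinary per-pixel cross-entropy; only pixels whose window straddles two or more classes are relaxed.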

Pages: 8848-8857

Page count: 10

