Panoptic SwiftNet: Pyramidal Fusion for Real-Time Panoptic Segmentation

Cited by: 6
Authors
Saric, Josip [1]
Orsic, Marin [2]
Segvic, Sinisa [1]
Affiliations
[1] Univ Zagreb, Fac Elect Engn & Comp, Zagreb 10000, Croatia
[2] Microblink, Zagreb 10000, Croatia
Keywords
panoptic segmentation; real-time processing; satellite imagery; deep learning; computer vision; SCENE
DOI
10.3390/rs15081968
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Dense panoptic prediction is a key ingredient in many existing applications such as autonomous driving, automated warehouses, and remote sensing. Many of these applications require fast inference over large input resolutions on affordable or even embedded hardware. We propose to achieve this goal by trading off backbone capacity for multi-scale feature extraction. In comparison with contemporaneous approaches to panoptic segmentation, the main novelties of our method are efficient scale-equivariant feature extraction, cross-scale upsampling through pyramidal fusion, and boundary-aware learning of pixel-to-instance assignment. The proposed method is very well suited for remote sensing imagery due to the huge number of pixels in typical city-wide and region-wide datasets. We present panoptic experiments on Cityscapes, Vistas, COCO, and the BSB-Aerial dataset. Our models outperform the state of the art on the BSB-Aerial dataset while being able to process more than a hundred 1 MPx images per second on an RTX 3090 GPU with FP16 precision and TensorRT optimization.
Pages: 18