HUMAN-MACHINE COLLABORATIVE VIDEO CODING THROUGH CUBOIDAL PARTITIONING

Cited by: 5
Authors
Ahmmed, Ashek [1, 4]
Paul, Manoranjan [1]
Murshed, Manzur [2]
Taubman, David [3]
Affiliations
[1] Charles Sturt Univ, Sch Comp & Math, Bathurst, NSW, Australia
[2] Federat Univ, Sch Sci Engn & Informat Technol, Ballarat, Vic, Australia
[3] Univ New South Wales, Sch Elect Engn & Telecommun, Kensington, NSW, Australia
[4] Univ New South Wales, Sch Engn & Informat Technol, Kensington, NSW, Australia
Source
2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP) | 2021
Keywords
Cuboid; HEVC; VCM; Object detection
DOI
10.1109/ICIP42928.2021.9506150
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video coding algorithms encode and decode an entire video frame, while feature coding techniques preserve and communicate only the most critical information needed for a given application. This is because video coding targets human perception, while feature coding targets machine vision tasks. Recently, attempts have been made to bridge the gap between these two domains. In this work, we propose a video coding framework that uses cuboids to leverage the commonality between human vision and machine vision applications. Cuboids, which are estimated rectangular regions over a video frame, are computationally efficient, have a compact representation, and are object-centric. Such properties have already been shown to add value to traditional video coding systems. Herein, cuboidal feature descriptors are extracted from the current frame and then employed to accomplish a machine vision task in the form of object detection. Experimental results show that a trained classifier yields superior average precision when equipped with a cuboidal-feature-oriented representation of the current test frame. Additionally, this representation costs 7% less in bit rate if the captured frames need to be communicated to a receiver.
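The abstract describes cuboids only as estimated rectangular regions over a frame and does not say how they are estimated. As a rough illustration, the sketch below assumes a greedy scheme that repeatedly applies the axis-aligned cut giving the largest drop in sum of squared errors (SSE) about each region's mean. The function names (`sse`, `best_split`, `cuboidal_partition`), the SSE criterion, and the `(top, left, bottom, right)` layout are assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def sse(block):
    """Sum of squared errors of a block about its own mean."""
    return float(((block - block.mean()) ** 2).sum())


def best_split(frame, cuboid):
    """Best axis-aligned cut of one cuboid.

    Returns (gain, child_a, child_b), or None if the cuboid is a single pixel.
    Cuboids are (top, left, bottom, right) with exclusive bottom/right.
    """
    top, left, bottom, right = cuboid
    parent = sse(frame[top:bottom, left:right])
    best = None
    for r in range(top + 1, bottom):  # horizontal cuts
        cost = sse(frame[top:r, left:right]) + sse(frame[r:bottom, left:right])
        if best is None or parent - cost > best[0]:
            best = (parent - cost, (top, left, r, right), (r, left, bottom, right))
    for c in range(left + 1, right):  # vertical cuts
        cost = sse(frame[top:bottom, left:c]) + sse(frame[top:bottom, c:right])
        if best is None or parent - cost > best[0]:
            best = (parent - cost, (top, left, bottom, c), (top, c, bottom, right))
    return best


def cuboidal_partition(frame, n_cuboids):
    """Greedily split a 2-D frame into at most n_cuboids rectangles."""
    frame = np.asarray(frame, dtype=np.float64)
    cuboids = [(0, 0, frame.shape[0], frame.shape[1])]
    while len(cuboids) < n_cuboids:
        candidates = []
        for i, cuboid in enumerate(cuboids):
            split = best_split(frame, cuboid)
            if split is not None:
                candidates.append((split, i))
        if not candidates:  # every cuboid is already a single pixel
            break
        (gain, child_a, child_b), idx = max(candidates, key=lambda t: t[0][0])
        cuboids[idx:idx + 1] = [child_a, child_b]
    return cuboids


# Toy frame: dark left half, bright right half.
toy = np.zeros((8, 8))
toy[:, 4:] = 100.0
print(cuboidal_partition(toy, 2))  # [(0, 0, 8, 4), (0, 4, 8, 8)]
```

Under these assumptions, each cuboid can be summarized by four coordinates and a mean intensity, which is one way to read the abstract's claim that cuboids have a compact representation.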
Pages: 2074-2078
Number of pages: 5
Related papers
21 in total
  • [1] Ahmmed A., 2021, ICIP, P2021
  • [2] Ahmmed A., 2020, IEEE INT WORKSH MULT, P1, DOI 10.1109/mmsp48831.2020.9287138
  • [3] DYNAMIC POINT CLOUD COMPRESSION USING A CUBOID ORIENTED DISCRETE COSINE BASED MOTION MODEL
    Ahmmed, Ashek
    Paul, Manoranjan
    Murshed, Manzur
    Taubman, David
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 1935 - 1939
  • [4] Ahmmed A, 2020, INT CONF ACOUST SPEE, P2188, DOI 10.1109/ICASSP40776.2020.9053851
  • [5] Albawi S, 2017, I C ENG TECHNOL
  • [6] Bellver M., 2016, HIERARCHICAL OBJECT
  • [7] Compact Descriptors for Video Analysis: The Emerging MPEG Standard
    Duan, Ling-Yu
    Chandrasekhar, Vijay
    Wang, Shiqi
    Lou, Yihang
    Lin, Jie
    Bai, Yan
    Huang, Tiejun
    Kot, Alex Chichung
    Gao, Wen
    [J]. IEEE MULTIMEDIA, 2019, 26 (02) : 44 - 54
  • [8] Overview of the MPEG CDVS standard
    Duan, Ling-Yu
    Huang, Tiejun
    Gao, Wen
    [J]. 2015 DATA COMPRESSION CONFERENCE (DCC), 2015, : 323 - 332
  • [9] Video Coding for Machines: A Paradigm of Collaborative Compression and Intelligent Analytics
    Duan, Lingyu
    Liu, Jiaying
    Yang, Wenhan
    Huang, Tiejun
    Gao, Wen
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 8680 - 8695
  • [10] Multi-Scale Deep Reinforcement Learning for Real-Time 3D-Landmark Detection in CT Scans
    Ghesu, Florin-Cristian
    Georgescu, Bogdan
    Zheng, Yefeng
    Grbic, Sasa
    Maier, Andreas
    Hornegger, Joachim
    Comaniciu, Dorin
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2019, 41 (01) : 176 - 189