Applying Transformer-Based Computer Vision Models to Adaptive Bitrate Allocation for 360° Live Streaming

Cited by: 2
Authors
Ao, Alice [1 ]
Park, Sohee [1 ]
Affiliations
[1] Yale Univ, Comp Sci, New Haven, CT 06520 USA
Source
2024 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE, WCNC 2024 | 2024
Keywords
360° Video; 360° Live Streaming; Adaptive Streaming; Transformers; Machine Learning
DOI
10.1109/WCNC57260.2024.10571028
CLC number
TP3 [Computing technology, computer technology]
Discipline code
0812
Abstract
Despite the heightened popularity of virtual reality (VR) and 360° video, 360° content remains expensive and difficult to stream. 360° live streaming is especially challenging, as it requires high bandwidth and low latency to avoid quality and motion-sickness issues. This paper explores how adaptive bitrate allocation, in which only the user's predicted viewport is streamed in high quality and the rest of the view is streamed in low quality, can improve the quality of the perceived viewport. Transformer-based saliency models pre-trained on 2D images are used for viewport prediction. Key contributions include 1) determining whether transformer-based models for 2D images are effective for saliency detection on 360° content, 2) examining the viewport-prediction accuracy of saliency-only models, and 3) a novel bitrate allocation algorithm. Empirical results demonstrate that even without access to head-movement data or fine-tuning, these models yield higher quality in the user's perceived viewport than traditional non-adaptive streaming.
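As a purely illustrative aside, the following minimal Python sketch shows one way saliency-driven tile bitrate allocation of the kind the abstract describes could look. The tile grid, bitrate levels, budget, and greedy upgrade rule are assumptions invented for this example, not the paper's actual algorithm; the saliency map is assumed to come from a pre-trained 2D transformer saliency model applied to the equirectangular frame.

    # Illustrative sketch only; not the paper's algorithm. Assumes the
    # equirectangular frame is split into a fixed tile grid and that a
    # per-pixel saliency map in [0, 1] (e.g., from a pre-trained 2D
    # transformer saliency model) is already available as a NumPy array.
    import numpy as np

    def allocate_tile_bitrates(saliency, grid=(4, 8),
                               hi_kbps=8000, lo_kbps=1000,
                               budget_kbps=96000):
        """Rank tiles by mean saliency; upgrade the most salient tiles to
        hi_kbps until the budget is exhausted, leaving the rest at lo_kbps."""
        rows, cols = grid
        h, w = saliency.shape
        th, tw = h // rows, w // cols

        # Mean saliency per tile.
        scores = np.array([[saliency[r*th:(r+1)*th, c*tw:(c+1)*tw].mean()
                            for c in range(cols)] for r in range(rows)])

        # Start every tile at the low rate, then greedily upgrade tiles in
        # descending saliency order while the budget allows it.
        bitrates = np.full(grid, float(lo_kbps))
        spent = lo_kbps * rows * cols
        flat_order = np.argsort(scores, axis=None)[::-1]
        for r, c in zip(*np.unravel_index(flat_order, grid)):
            if spent + (hi_kbps - lo_kbps) <= budget_kbps:
                bitrates[r, c] = hi_kbps
                spent += hi_kbps - lo_kbps
        return bitrates

    # Example: a synthetic saliency map peaked near the frame center.
    sal_y, sal_x = np.mgrid[0:256, 0:512]
    sal = np.exp(-(((sal_x - 256) / 120.0)**2 + ((sal_y - 128) / 60.0)**2))
    print(allocate_tile_bitrates(sal))

Tiles overlapping the predicted viewport accumulate the highest mean saliency and are upgraded first, while everything else stays at the low rate, which mirrors the adaptive-versus-non-adaptive contrast the abstract describes.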
Pages: 6