Applying Transformer-Based Computer Vision Models to Adaptive Bitrate Allocation for 360° Live Streaming

Cited by: 2

Authors:

Ao, Alice [1]

Park, Sohee [1]

Affiliations:

[1] Yale Univ, Comp Sci, New Haven, CT 06520 USA

Source:

2024 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE, WCNC 2024 | 2024

Keywords:

360° Video; 360° Live Streaming; Adaptive Streaming; Transformers; Machine Learning

DOI:

10.1109/WCNC57260.2024.10571028

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Despite the heightened popularity of virtual reality (VR) and 360° video, 360° content remains expensive and difficult to stream. 360° live streaming is especially challenging, as it requires high bandwidth and low latency to avoid quality and motion-sickness issues. This paper explores how adaptive bitrate allocation, in which only the user's predicted viewport is streamed in high quality and the rest of the view is streamed in low quality, can increase viewport quality. Transformer-based saliency models pre-trained on 2D images are used for viewport prediction. Key contributions include 1) determining whether transformer-based models for 2D images are effective for saliency detection of 360° content, 2) examining the viewport prediction accuracy of saliency-only models, and 3) a novel bitrate allocation algorithm. Empirical results demonstrate that even without access to head-movement data or fine-tuning, these models lead to increased quality in a user's perceived viewport over traditional non-adaptive streaming.
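
The abstract does not spell out the allocation algorithm itself, so the following is only a minimal sketch of the general idea of saliency-weighted bitrate allocation, not the authors' method. It assumes the frame is divided into tiles, that a saliency model has already produced one non-negative score per tile, and that a total bitrate budget is known. The function name `allocate_bitrate` and the parameters `budget_kbps` and `floor_kbps` are hypothetical and chosen for illustration.

```python
def allocate_bitrate(saliency, budget_kbps, floor_kbps=200.0):
    """Split a total bitrate budget across tiles in proportion to saliency.

    Hypothetical helper (not from the paper): every tile first receives
    `floor_kbps`, so the low-priority region is still streamed at low
    quality, and the remaining budget is shared by normalized saliency.
    """
    n = len(saliency)
    if n == 0:
        return []

    base = floor_kbps * n
    if base > budget_kbps:
        raise ValueError("budget too small for the per-tile floor")

    total = sum(saliency)
    if total <= 0:
        # No saliency signal: fall back to uniform (non-adaptive) allocation.
        return [budget_kbps / n] * n

    spare = budget_kbps - base
    return [floor_kbps + spare * s / total for s in saliency]


if __name__ == "__main__":
    # Four tiles; the second tile is the most salient (likely viewport).
    rates = allocate_bitrate([1, 6, 3, 0], budget_kbps=2000.0)
    for tile, rate in enumerate(rates):
        print(f"tile {tile}: {rate:.0f} kbps")
```

In this sketch the per-tile floor keeps regions outside the predicted viewport watchable if the prediction is wrong, while the most salient tiles receive most of the spare budget.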

Pages: 6

