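Viewport
Computer science
Encoder
Enhanced Data Rates for GSM Evolution
Video streaming
Video tracking
Video quality
Artificial intelligence
Computer vision
Real-time computing
Video processing
Operations management
Operating system
Economics
Metric (unit)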
Authors
Zipeng Pan, Zhang Yuan, Tao Lin, Jinyao Yan
Identifier
DOI: 10.1145/3609395.3610597
Abstract
Viewport prediction plays a crucial role in live 360° video streaming: it determines which tiles should be prefetched in high quality, and therefore significantly affects the user experience. However, the current approach to viewport prediction, which integrates content-level visual features with the viewer's head movement trajectory, struggles to balance prediction accuracy against computational complexity. In this paper, we propose LiveAE, a novel attention-based and edge-assisted viewport prediction framework for live 360° video streaming. Specifically, we employ a pre-trained video encoder called Vision Transformer (ViT) for general visual feature extraction and a cross-attention mechanism for user-specific interest tracking. To address the computational complexity issue, we offload these content-level operations to an edge server while retaining trajectory-related functions on the client side. Extensive experiments show that our proposed method not only outperforms state-of-the-art algorithms but also meets the real-time requirements of live 360° video streaming.
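The cross-attention step mentioned in the abstract can be illustrated with a minimal sketch. This is not the LiveAE implementation: it assumes a single attention head, omits the learned query/key/value projections, and uses made-up two-dimensional vectors. The names `cross_attention`, `trajectory_query`, and `tile_features` are hypothetical. It only shows how an embedding of the viewer's head trajectory (the query) could weight per-tile visual features (the keys and values) by scaled dot-product similarity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query, keys, values):
    """Single-head scaled dot-product cross-attention for one query vector.

    query  : list[float], e.g. a trajectory embedding (assumed input)
    keys   : list[list[float]], one vector per tile
    values : list[list[float]], one vector per tile
    Returns the attended context vector and the per-tile attention weights.
    """
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in keys
    ]
    weights = softmax(scores)
    value_dim = len(values[0])
    context = [
        sum(w * value[j] for w, value in zip(weights, values))
        for j in range(value_dim)
    ]
    return context, weights


# Hypothetical toy data: one trajectory embedding and three tile features.
trajectory_query = [1.0, 0.0]
tile_features = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]]

context, weights = cross_attention(trajectory_query, tile_features, tile_features)
```

In this toy setup the tile whose feature vector is most aligned with the query receives the largest weight. In a split like the one the abstract describes, the tile features would be computed on the edge server and the trajectory handling would stay on the client; where exactly the attention itself runs is not specified here.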