Authors
Pengyu Zhao, Yuanxing Zhang, Kaigui Bian, Hu Tuo, Lingyang Song
Identifier
DOI: 10.1109/icassp.2019.8682776
Abstract
In the past few years, virtual reality (VR) has become an enabling technology, not only enriching our visual experience but also opening new channels for business. Untethered mobile devices are the main platform for watching 360-degree content, so accurately predicting future viewpoints is a key challenge in improving playback quality. In this paper, we investigate the image features of 360-degree videos and the contextual information of viewpoint trajectories. Specifically, we design the ladder convolution to adapt to the distorted image, and propose LadderNet to transfer knowledge from a pre-trained model and extract features from the distorted image. We then combine the image features and the contextual viewpoints as inputs to a long short-term memory (LSTM) network to predict future viewpoints. We compare our approach with several state-of-the-art viewpoint prediction algorithms on two 360-degree video datasets. Results show that our approach improves the Intersection over Union (IoU) by at least 5% and meets the requirements for 360-degree video playback on mobile devices.
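The abstract names a "ladder convolution" that adapts convolution to the distortion of the 360-degree frame but does not define the operator. The sketch below is one speculative reading, assuming an equirectangular projection whose horizontal stretching grows toward the poles, compensated by latitude bands with progressively wider horizontal dilation; the class name, band split, and dilation schedule are all assumptions, not the paper's actual definition.

```python
import torch
import torch.nn as nn

class LatitudeBandConv(nn.Module):
    """Speculative stand-in for the 'ladder convolution': split the
    frame into latitude bands and give bands farther from the equator
    a wider horizontal dilation, since equirectangular projection
    stretches content horizontally toward the poles."""

    def __init__(self, in_ch, out_ch, bands=4):
        super().__init__()
        mid = (bands - 1) / 2
        # Dilation grows with a band's distance from the equator row.
        dilations = [1 + round(abs(i - mid)) for i in range(bands)]
        self.convs = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, kernel_size=3,
                      padding=(1, d), dilation=(1, d))
            for d in dilations
        )

    def forward(self, x):
        # x: (B, C, H, W); chunk along height into latitude bands,
        # convolve each band with its own dilation, then reassemble.
        chunks = torch.chunk(x, len(self.convs), dim=2)
        return torch.cat([conv(c) for conv, c in zip(self.convs, chunks)],
                         dim=2)
```

For example, `LatitudeBandConv(3, 64)(torch.randn(1, 3, 128, 256))` returns a `(1, 64, 128, 256)` tensor, preserving spatial size while sampling more widely near the top and bottom bands.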
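The fusion step, concatenating per-frame image features with the viewpoint-trajectory context and feeding the sequence to an LSTM, can be sketched as follows. All dimensions, layer names, and the (longitude, latitude) parameterization of a viewpoint are illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn as nn

class ViewpointLSTM(nn.Module):
    """Fuses image features with past viewpoint coordinates and
    predicts the next viewpoint. Sizes are illustrative."""

    def __init__(self, feat_dim=512, vp_dim=2, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim + vp_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vp_dim)  # (longitude, latitude)

    def forward(self, img_feats, viewpoints):
        # img_feats:  (B, T, feat_dim) features from the image branch
        # viewpoints: (B, T, 2)        normalized (lon, lat) history
        x = torch.cat([img_feats, viewpoints], dim=-1)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])   # predicted next viewpoint
```

A call such as `ViewpointLSTM()(torch.randn(8, 10, 512), torch.randn(8, 10, 2))` yields an `(8, 2)` prediction for the next viewpoint of each sequence in the batch.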
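The evaluation metric, Intersection over Union between predicted and actual viewports, can be computed with a simple planar approximation on the equirectangular frame. The field-of-view values and the longitude wrap-around handling below are assumptions; the paper's exact evaluation protocol may differ.

```python
def viewport_iou(c1, c2, fov=(100.0, 100.0)):
    """IoU of two rectangular viewports on an equirectangular frame.

    c1, c2: (longitude, latitude) centers in degrees; fov is the
    (horizontal, vertical) field of view. Planar approximation that
    handles longitude wrap-around at +/-180 degrees.
    """
    w, h = fov
    area = w * h
    # Vertical overlap (latitude does not wrap).
    v = max(0.0, min(c1[1], c2[1]) + h / 2 - (max(c1[1], c2[1]) - h / 2))
    # Horizontal overlap using the circular longitude distance.
    d = abs(c1[0] - c2[0])
    d = min(d, 360.0 - d)
    u = max(0.0, w - d)
    inter = u * v
    return inter / (2 * area - inter)
```

For instance, `viewport_iou((10, 0), (30, 5))` gives roughly 0.61: the two viewports overlap by 80 of 100 degrees horizontally and 95 of 100 degrees vertically.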