Keywords
Computer science
Coding (social sciences)
Workflow
Artificial intelligence
Coding tree unit
Algorithm
Machine learning
Mathematics
Statistics
Decoding methods
Database
Authors
Zheng Liu,Tianyi Li,Ying Chen,Kaijin Wei,Mai Xu,Honggang Qi
Identifier
DOI:10.1109/tcsvt.2023.3262733
Abstract
The latest Versatile Video Coding (VVC) standard significantly improves coding efficiency over its predecessor, the High Efficiency Video Coding (HEVC) standard, but at the expense of excessively high complexity. As measured by the VVC test model (VTM), the intra-mode comparison and selection in the rate-distortion optimization (RDO) search consume most of the encoding time. In this paper, we propose a deep multi-task learning based fast intra-mode decision approach that adaptively prunes the most redundant modes. First, we create a large-scale intra-mode database for VVC, covering both the normal angular modes and the newly introduced tools, i.e., intra sub-partition (ISP) and matrix-based intra prediction (MIP). Next, we propose a multi-task intra-mode decision network (MID-Net) model to effectively predict the most probable angular modes and whether to skip the ISP and MIP modes. Then, a fast intra-coding workflow is designed accordingly, involving rough mode decision (RMD) acceleration and candidate mode list (CML) pruning. For the workflow output, the learning-oriented probability and the statistics-oriented probability are synthesized to further improve prediction accuracy, ensuring that only unnecessary intra-modes are skipped. Finally, experimental results show that our approach reduces the encoding time of VVC intra-coding by 40.48% with negligible rate-distortion degradation, outperforming other state-of-the-art approaches.
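The abstract's pruning step, which fuses a learning-oriented probability with a statistics-oriented probability before trimming the candidate mode list, can be illustrated with a minimal sketch. The linear fusion weight `alpha`, the `keep_k` list size, the `skip_threshold`, and all function names below are illustrative assumptions; the abstract does not specify the actual synthesis rule or thresholds used in the paper.

```python
import numpy as np

def prune_candidate_modes(learned_prob, stats_prob, alpha=0.5, keep_k=3):
    """Fuse the network's per-mode probability (learned_prob) with a
    statistical prior (stats_prob) and keep the top-k angular modes.
    alpha and keep_k are illustrative, not the paper's values."""
    fused = alpha * np.asarray(learned_prob) + (1.0 - alpha) * np.asarray(stats_prob)
    fused = fused / fused.sum()  # renormalize the synthesized probability
    # Indices of the keep_k most probable modes, highest first
    return np.argsort(fused)[::-1][:keep_k]

def should_skip_tool(skip_prob, skip_threshold=0.9):
    """Skip an ISP/MIP evaluation only when the predicted skip
    probability is high, so only unnecessary modes are dropped."""
    return skip_prob >= skip_threshold
```

In this reading, the fused distribution lets a confident network prediction dominate while the statistical prior guards against mispredictions, so RDO only evaluates the surviving candidate modes.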