Computer science
Human visual system model
Artificial intelligence
Enhanced Data rates for GSM Evolution (EDGE)
Domain (mathematical analysis)
Masking (illustration)
Quality (philosophy)
Computer vision
Pattern recognition (psychology)
Image (mathematics)
Visual arts
Mathematical analysis
Epistemology
Mathematics
Philosophy
Art
Authors
Aoxiang Zhang, Yuan-Gen Wang, Weixuan Tang, Leida Li, Sam Kwong
Identifier
DOI:10.1109/tcyb.2023.3338615
Abstract
The quality of videos is the primary concern of video service providers. Built upon deep neural networks, video quality assessment (VQA) has progressed rapidly. Although existing works have introduced knowledge of the human visual system (HVS) into VQA, some limitations still hinder its full exploitation, including incomplete modeling with few HVS characteristics and insufficient connection among those characteristics. In this article, we present a novel spatial-temporal VQA method termed HVS-5M, wherein we design five modules to simulate five characteristics of HVS and create a bioinspired connection among these modules in a cooperative manner. Specifically, on the side of the spatial domain, the visual saliency module first extracts a saliency map. Then, the content-dependency and edge masking modules extract the content and edge features, respectively, which are both weighted by the saliency map to highlight those regions that human beings may be interested in. On the other side, in the temporal domain, the motion perception module extracts dynamic temporal features. In addition, the temporal hysteresis module simulates the memory mechanism of human beings and comprehensively evaluates the video quality according to the fused features from the spatial and temporal domains. Extensive experiments show that our HVS-5M outperforms the state-of-the-art VQA methods. Ablation studies are further conducted to verify the effectiveness of each module within the proposed method. The source code is available at https://github.com/GZHU-DVL/HVS-5M.
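The saliency-weighted aggregation of spatial features and their fusion with temporal features, as described in the abstract, can be sketched as follows. This is an illustrative NumPy reconstruction under stated assumptions, not the authors' released implementation (see the linked repository); all function names, variable names, and tensor shapes here are hypothetical.

```python
import numpy as np

def saliency_weighted_pool(features, saliency):
    """Weight per-location features by a saliency map, then pool spatially.

    features: (C, H, W) feature map (e.g., content or edge features)
    saliency: (H, W) non-negative visual-saliency map
    returns:  (C,) saliency-weighted global feature vector
    """
    w = saliency / (saliency.sum() + 1e-8)        # normalize to spatial weights
    return (features * w[None, :, :]).sum(axis=(1, 2))

# Toy example: combine saliency-weighted spatial features with temporal ones.
rng = np.random.default_rng(0)
content  = rng.random((8, 4, 4))   # content-dependency features
edge     = rng.random((8, 4, 4))   # edge-masking features
saliency = rng.random((4, 4))      # visual-saliency map
temporal = rng.random(8)           # motion-perception features

spatial = np.concatenate([saliency_weighted_pool(content, saliency),
                          saliency_weighted_pool(edge, saliency)])
fused = np.concatenate([spatial, temporal])  # passed on to quality regression
```

With a uniform saliency map the pooling reduces to a plain spatial mean, so the saliency map acts purely as a re-weighting of where the quality model "looks".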