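Anomaly detection
Computer science
Artificial intelligence
Pattern recognition (psychology)
Computer vision
Image processing
Image (mathematics)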
Authors
Jiaxu Leng,Yumeng Zhang,Mingpi Tan,Changjiang Kuang,Zhanjie Wu,Ji Gan,Xinbo Gao
Identifier
DOI:10.1109/tip.2025.3614006
Abstract
Conventional reconstruction-based video anomaly detection (VAD) methods model normality implicitly in latent spaces and are therefore limited by the generalization ability of the latent features. Normalizing Flow (NF)-based methods have been introduced to address this issue: they explicitly model the distribution of the input data and achieve strong performance in VAD. However, existing NF-based methods are confined to Euclidean space, which limits their ability to model action hierarchies. While effective at capturing local joint dynamics and short-term temporal variations, they fail to encode kinematic dependencies and long-term pose evolution, and consequently struggle to discern ambiguous anomalies that deviate only minimally from normal motion. In contrast, hyperbolic representation learning can model hierarchical and complex relationships among actions, and so offers a promising way to improve discrimination between similar skeletal actions. Motivated by this, we propose a novel Dual-Space Normalizing Flow (DSNF) method. Specifically, we design a Dual-Space Parallel Graph Convolutional Network (DSPGCN) that combines Euclidean and hyperbolic geometries to capture both the local detail features of poses and the intrinsic hierarchical relationships of actions. To sharpen the model's focus on discriminative features, we design an Adaptive Weighted Approximation Mass (AWAM) loss that dynamically adjusts weights to impose stronger constraints on regions of the dual space with low discriminability, encouraging the model to focus on key discriminative features in hyperbolic space that reflect complex relationships between actions. Extensive experiments on public datasets demonstrate the effectiveness and robustness of our method in various VAD scenarios.
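To give a feel for why a hyperbolic branch might separate similar actions better than a purely Euclidean one, the sketch below compares distances in the unit Poincaré ball. It is only an illustration under stated assumptions: the abstract does not say which hyperbolic model or curvature DSNF uses, so the Poincaré ball with curvature -1 is assumed here, and the helper names `expmap0` and `poincare_distance` are hypothetical and not taken from the paper.

```python
import math


def expmap0(v):
    """Map a Euclidean (tangent) vector at the origin into the unit Poincare ball.

    Assumes curvature -1; this is an illustrative choice, not the paper's setting.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    scale = math.tanh(norm) / norm
    return [scale * x for x in v]


def poincare_distance(x, y):
    """Geodesic distance between two points strictly inside the unit Poincare ball."""
    def sq_norm(u):
        return sum(a * a for a in u)

    diff = sq_norm([a - b for a, b in zip(x, y)])
    denom = (1.0 - sq_norm(x)) * (1.0 - sq_norm(y))
    return math.acosh(1.0 + 2.0 * diff / denom)


# Two pairs of points with the same Euclidean gap of 0.1.
near_origin = poincare_distance([0.0, 0.0], [0.1, 0.0])
near_boundary = poincare_distance([0.8, 0.0], [0.9, 0.0])
print(near_origin, near_boundary)
```

The same Euclidean gap corresponds to a larger hyperbolic distance near the boundary of the ball than near the origin. This expansion is the property usually cited when hyperbolic embeddings are used for hierarchies, where fine-grained items tend to sit near the boundary.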