计算机科学
姿势
人工智能
计算机视觉
卡尔曼滤波器
运动学
滤波器(信号处理)
运动捕捉
噪音(视频)
三维姿态估计
运动(物理)
图像(数学)
物理
经典力学
作者
Enrico Martini,Michele Boldo,Nicola Bombieri
标识
DOI:10.1016/j.sigpro.2024.109598
摘要
There is a growing interest in adopting 3D human pose estimation in safety-critical systems, from healthcare to Industry 5.0. Nevertheless, when applied in such settings, these neural networks may suffer from estimation inaccuracy. Besides imprecise or inconsistent annotations in the training dataset, the inaccuracy is caused by poor image quality, rare poses, dropped frames, or heavy occlusions in the scene. In addition, these scenarios often require the software results to have temporal constraints, such as real-time and zero- or low-latency, which make many of the filtering solutions proposed in the literature inapplicable. This paper proposes FLK, a Filter with Learned Kinematics, to refine 3D human motion data in real-time and at zero/low latency. The temporal core combines a Kalman filter and a low-pass filter, which learns the motion model through a recurrent neural network. The spatial core takes advantage of the biomechanical constraints of the human body to provide spatial coherency between keypoints. The combination of the cores allows the filter to adequately address different types of noise, from jittering to dropped frames. We test the filter on motion data from multiple datasets and seven 3D human pose estimation backbones, improving accuracy up to 140 mm with non-Gaussian noise and 53 mm with missing information.
科研通智能强力驱动
Strongly Powered by AbleSci AI