MMCPose: Multimodal Condition-Driven 3D Human Pose Estimation Via Diffusion Models

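Computer science, Artificial intelligence, Pose, Relation (database), Monocular, Computer vision, Block (permutation group theory), Noise (video), Perception, 3D pose estimation, Machine learning, Deep learning, Pattern recognition (psychology), Power (physics), Articulated human pose estimation, Feature (linguistics), Noise reduction, Feature learning, Reduction (mathematics), Joint (building), Mechanism (biology), Discriminative, Supervised learning, Visualization, Scheme (mathematics), Data modeling, Solid modeling, Frame (networking), Emphasis (telecommunications), Contrast (vision)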
Authors
Xixia Xu, Jiamao Li
Source
Journal: IEEE Transactions on Multimedia [Institute of Electrical and Electronics Engineers]
Volume/Issue: not listed; pages 1-11
Identifier
DOI:10.1109/tmm.2026.3654424
Abstract

Nowadays, diffusion-based methods for monocular 3D human pose estimation (3D HPE) have achieved state-of-the-art performance by directly regressing the 3D joint coordinates from the 2D observations. Although some methods incorporate the human body prior to improve the denoising quality, the absence of structural relations and pose-aware guidance makes these models prone to generating unreasonable poses. The challenge is especially noticeable under complex conditions such as occlusions and crowded scenes. To alleviate this, we present MMCPose, a novel Multi-modal Condition-driven 3D HPE framework via diffusion models that capitalizes on the benefits of multi-modal conditioning input. Specifically, we propose a Multi-modal Condition Learning (MCL) strategy that incorporates multi-modal conditions such as joint-wise relations, part-aware prompts, and pose-aware masks to improve the generation quality. The MCL block consists of three components: (i) Joint-wise Relation Condition Learning (JRCL), which models the flexible joint-wise relations via a GCN to mitigate disturbances arising from confused joints; (ii) Part-aware Prompt Condition Learning (PPCL), which constructs multi-granular prompts from accessible texts and feasible knowledge of body parts, together with learnable prompts, to model implicit textual guidance; and (iii) Pose-aware Mask Condition Learning (PMCL), which designs a pose-specific mask to increase the model's emphasis on the pose region, improving precision in capturing intricate pose details. Furthermore, we explore a Multi-modal Condition-Pose Interaction learning (MCPI) mechanism that establishes interaction between the learned multi-modal conditions and poses to maximize the effect of the conditions. This method fully unleashes the perceptual capability of the multi-modal conditions in diffusion-based 3D HPE. Extensive evaluations on two popular benchmarks (Human3.6M and MPI-INF-3DHP) show that MMCPose achieves new state-of-the-art performance.
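
The abstract names a GCN as the tool JRCL uses to model joint-wise relations but gives no architectural details. As a rough illustration only, the sketch below shows one generic graph-convolution step (symmetrically normalized adjacency with self-loops, a common textbook formulation) over a hypothetical five-joint toy skeleton. The skeleton layout, the identity weight matrix, and every function name here are assumptions made for illustration; none of them are taken from MMCPose.

```python
import math

# Hypothetical 5-joint toy skeleton (NOT the Human3.6M joint layout):
# 0 = pelvis, 1 = spine, 2 = head, 3 = left hand, 4 = right hand
EDGES = [(0, 1), (1, 2), (1, 3), (1, 4)]
NUM_JOINTS = 5


def normalized_adjacency(edges, n):
    """Return D^{-1/2} (A + I) D^{-1/2} as a nested list."""
    # Start from the identity so every joint has a self-loop.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain matrix product of two nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def gcn_layer(features, weight, adj):
    """One graph-convolution step: ReLU(adj @ features @ weight)."""
    projected = matmul(matmul(adj, features), weight)
    return [[max(0.0, v) for v in row] for row in projected]


ADJ = normalized_adjacency(EDGES, NUM_JOINTS)

# Toy 2D joint coordinates used as per-joint input features (assumed values).
JOINTS_2D = [[0.0, 0.0], [0.0, 1.0], [0.0, 2.0], [-1.0, 1.0], [1.0, 1.0]]

# Identity weights stand in for a learned projection (an assumption).
WEIGHT = [[1.0, 0.0], [0.0, 1.0]]

out = gcn_layer(JOINTS_2D, WEIGHT, ADJ)
for joint_id, row in enumerate(out):
    print(joint_id, [round(v, 3) for v in row])
```

Each output row mixes a joint's own features with those of its skeletal neighbours, which is the sense in which a graph convolution lets one joint's estimate be informed by related joints. How JRCL actually builds its graph and feeds the result into the diffusion model is not specified in the abstract.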