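Pose
Computer science
Coding (set theory)
Representation (politics)
Artificial intelligence
Offset (computer science)
Categorization
Computer vision
Pattern recognition (psychology)
Politics
Political science
Law
Set (abstract data type)
Programming language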
Authors
Tao Wang, Lei Jin, Z. Jane Wang, Xiaojin Fan, Yu Cheng, Yinglei Teng, Junliang Xing, Jian Zhao
Identifier
DOI:10.1145/3581783.3611989
Abstract
Multi-person pose estimation in crowded scenes remains a very challenging task. This paper finds that most previous methods fail at estimating or grouping visible keypoints in crowded scenes, rather than at reasoning about invisible keypoints. We thus categorize crowded scenes into entanglement and occlusion based on the visibility of human parts, and observe that entanglement is a significant problem in crowded scenes. With this observation, we propose DecenterNet, an end-to-end deep architecture for robust and efficient pose estimation in crowded scenes. Within DecenterNet, we introduce a decentralized pose representation that uses all visible keypoints as root points to represent human poses, which is more robust in entangled areas. We also propose a decoupled pose assessment mechanism, which introduces a location map to adaptively select optimal poses in the offset map. In addition, we have constructed a new dataset named SkatingPose, containing more entangled scenes. The proposed DecenterNet surpasses the best method on SkatingPose by 1.8 AP. Furthermore, DecenterNet obtains 71.2 AP and 71.4 AP on the COCO and CrowdPose datasets, respectively, demonstrating the superiority of our method. We will release our source code, trained models, and dataset to facilitate further studies in this research direction. Our code and dataset are available at https://github.com/InvertedForest/DecenterNet.
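The idea of using every visible keypoint as a root, with a location map choosing which root to decode from, can be illustrated with a toy sketch. This is not the DecenterNet implementation: the function names `encode_decentralized` and `decode_best_pose`, the array layouts, and the hand-set scores are all assumptions made for illustration, and a real network would predict both maps rather than have them filled in by hand.

```python
import numpy as np

def encode_decentralized(pose, visible, height, width):
    """Toy encoding (hypothetical layout, not the paper's).

    pose: (K, 2) integer array of (y, x) keypoint coordinates.
    visible: (K,) boolean array marking visible keypoints.
    Returns an offset map of shape (H, W, K, 2) and a root mask (H, W).
    """
    num_kpts = pose.shape[0]
    offsets = np.zeros((height, width, num_kpts, 2), dtype=np.float32)
    roots = np.zeros((height, width), dtype=bool)
    for k in np.flatnonzero(visible):
        y, x = pose[k]
        # At this root pixel, store the displacement to every keypoint.
        offsets[y, x] = pose - pose[k]
        roots[y, x] = True
    return offsets, roots

def decode_best_pose(offsets, location_map):
    """Decode the pose from the root pixel with the highest location score."""
    y, x = np.unravel_index(np.argmax(location_map), location_map.shape)
    return np.array([y, x], dtype=np.float32) + offsets[y, x]

# One person with three keypoints; the middle one is not visible.
pose = np.array([[2, 3], [5, 4], [7, 6]])
visible = np.array([True, False, True])
offsets, roots = encode_decentralized(pose, visible, 10, 10)

# Hand-set scores standing in for a predicted location map (assumption).
location_map = np.zeros((10, 10), dtype=np.float32)
location_map[2, 3] = 0.4
location_map[7, 6] = 0.9

decoded = decode_best_pose(offsets, location_map)
```

In this sketch any visible keypoint can recover the full pose, so a low score at one root (for example, one lying in an entangled region) does not prevent decoding from another.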