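Computer science
Bottleneck
Pose
Encoder
Convolutional neural network
Software deployment
Machine learning
Artificial intelligence
Data mining
Embedded system
Operating system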
Authors
Wei Liang, Cheng Zhang, Yanxia Wang, Junjia Han
Source
Journal: Research Square
Date: 2023-11-03
Identifier
DOI: 10.21203/rs.3.rs-3534285/v1
Abstract
Current methods for human pose estimation focus on improving prediction accuracy, but they overlook the computational cost and large number of parameters that hinder practical deployment. Although some lightweight pose estimation models have succeeded in reducing the number of parameters, they typically employ smaller convolutional kernels to reduce model size, which leads to insufficient capture of contextual information. To address this issue, this paper constructs a lightweight network model, EfficientPose. Specifically, to expand the receptive field and acquire richer feature information without increasing computational cost, this paper proposes the Efficient Bottleneck Block (EBB) module. Additionally, to capture global spatial dependencies and enhance the representation capability of low-resolution features, a Transformer encoder is introduced into the model. Meanwhile, to overcome the excessively long training time of lightweight models, a novel iterative training strategy is proposed to fully unleash the potential of EfficientPose. To validate the effectiveness of the EfficientPose model, extensive comparative experiments and ablation studies are conducted. Compared with HRNet-W48, which has the same backbone network, EfficientPose not only reduces the number of parameters by 72% at the same input image size but also improves accuracy by 0.8 and 0.9 percentage points on the COCO validation and test sets, respectively. Experiments show that the EfficientPose model maintains high accuracy even with a significant reduction in the number of parameters. This opens up the potential for further application in real-world scenarios with limited resources.