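Computer science
Artificial intelligence
Basketball
Pose
Robustness (evolution)
Context (archaeology)
Computer vision
Feature (linguistics)
Pattern recognition (psychology)
Frame (networking)
Machine learning
Telecommunications
Biology
History
Biochemistry
Gene
Philosophy
Paleontology
Linguistics
Archaeology
Chemistry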
Authors
Zhihao Zhang,Wenyue Liu,Zheng Yuan,Linkang Du,Lezhong Sun
Identifier
DOI:10.1038/s41598-025-14985-y
Abstract
Accurate athlete pose estimation in basketball is crucial for game analysis, player training, and tactical decision-making. However, existing pose estimation methods struggle with challenges common in basketball footage, such as motion blur, occlusions, and complex backgrounds. To address these issues, this paper proposes a basketball action pose estimation framework built on a multi-dimensional data stream network that extracts spatial, temporal, and contextual information separately. The spatial stream branch extracts multi-scale features and captures the spatial pose information of players in single-frame images through feature fusion and spatial attention mechanisms. The temporal stream branch merges the feature maps of adjacent frames to capture player motion across consecutive frames. The context stream branch generates a global context feature vector that encodes the entire image, offering a holistic perspective for pose estimation. A feature fusion module then integrates early fusion, late fusion, and hybrid fusion strategies to make full use of the multi-modal information. Finally, a stage-wise streaming training module progressively improves the model's accuracy and generalization ability over three stages. Experimental results show that the proposed framework significantly improves the accuracy and robustness of basketball action pose estimation, particularly in highly dynamic scenes and against complex backgrounds.
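The three fusion strategies named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no layer details, so the feature size, the keypoint count, the random linear heads, and the blending weight `alpha` are all assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_KEYPOINTS = 17  # assumption: COCO-style keypoint count, not stated in the abstract
FEAT_DIM = 32       # assumption: illustrative feature size per stream


def linear_head(in_dim, out_dim):
    """Return a random linear map standing in for a learned prediction head."""
    w = rng.standard_normal((in_dim, out_dim)) * 0.01
    return lambda x: x @ w


def early_fusion(streams, head):
    # Concatenate the stream features first, then predict once.
    return head(np.concatenate(streams, axis=-1))


def late_fusion(streams, heads):
    # Predict separately per stream, then average the predictions.
    return np.mean([h(s) for h, s in zip(heads, streams)], axis=0)


def hybrid_fusion(streams, early_head, late_heads, alpha=0.5):
    # Blend early- and late-fusion outputs; alpha is a hypothetical weight.
    early = early_fusion(streams, early_head)
    late = late_fusion(streams, late_heads)
    return alpha * early + (1 - alpha) * late


# Stand-ins for spatial, temporal, and context features of a batch of 4 frames.
spatial, temporal, context = (rng.standard_normal((4, FEAT_DIM)) for _ in range(3))
streams = [spatial, temporal, context]

out_dim = NUM_KEYPOINTS * 2  # one (x, y) pair per keypoint
early_head = linear_head(3 * FEAT_DIM, out_dim)
late_heads = [linear_head(FEAT_DIM, out_dim) for _ in range(3)]

pose = hybrid_fusion(streams, early_head, late_heads).reshape(4, NUM_KEYPOINTS, 2)
print(pose.shape)
```

In this sketch, setting `alpha=1.0` reduces the hybrid output to pure early fusion and `alpha=0.0` to pure late fusion, which makes the relationship between the three strategies easy to inspect.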