Authors
Zhaoyu Shou, Hang Liu, Xiaobu Xu, Dongxu Li, Ziyong Wu, Juhua Huang, Hongbin Liao
Abstract
Purpose: Occluded human pose estimation is a substantial challenge in smart classrooms and big data analysis, and it strongly affects the advancement of intelligent educational systems. The rapid progress of AI-driven education has created a growing need for precise human pose keypoint recognition in real classroom environments. However, occlusions caused by student interactions and by physical obstructions such as desks pose significant hurdles for conventional pose estimation methods, frequently reducing their reliability and performance. This study introduces a novel framework, YOLOXSP, designed to improve the precision of keypoint recognition and support the advancement of intelligent education systems.
Design/methodology/approach: The primary applications of this research lie in AI-powered education systems and large-scale classroom data analytics. The main objective is to develop an occluded pose estimation framework that accurately detects human keypoints under occlusion. This framework addresses the limitations of existing methods in handling occluded poses, offering more reliable solutions for intelligent educational environments. The YOLOXSP model includes a shared-separation dual-channel detection head (ShareSepHead) that separates spatial and channel features to improve detection accuracy. It further incorporates a novel sparse cyclic dual-channel attention mechanism (NS-CBAM), which selectively strengthens feature representations in occluded regions via multi-stage attention. Additionally, an anti-occlusion loss (Aol) is designed to adaptively penalize keypoint errors under occlusion, significantly improving the model's robustness in smart classroom settings.
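The abstract does not give the formulation of Aol, so the following is only a minimal illustrative sketch of the general idea of penalizing errors on occluded keypoints more heavily. It is not the authors' loss. The function name, the fixed `occluded_weight` parameter, and the use of COCO-style visibility flags (0 = not labeled, 1 = labeled but occluded, 2 = visible) are all assumptions made for illustration.

```python
def occlusion_weighted_keypoint_loss(pred, target, visibility, occluded_weight=2.0):
    """Weighted mean squared keypoint error (hypothetical, not the paper's Aol).

    pred, target: sequences of (x, y) keypoint coordinates.
    visibility:   COCO-style flags per keypoint (0, 1, or 2).
    occluded_weight: assumed fixed weight applied to occluded keypoints.
    """
    total = 0.0
    weight_sum = 0.0
    for (px, py), (tx, ty), v in zip(pred, target, visibility):
        if v == 0:
            # Unlabeled keypoints carry no ground truth, so skip them.
            continue
        # Occluded keypoints (v == 1) get a larger weight than visible ones.
        w = occluded_weight if v == 1 else 1.0
        total += w * ((px - tx) ** 2 + (py - ty) ** 2)
        weight_sum += w
    # Normalize by the total weight; return 0.0 when nothing is labeled.
    return total / weight_sum if weight_sum else 0.0
```

With this weighting, the same localization error contributes more to the loss when the keypoint is flagged as occluded than when it is visible. An adaptive scheme, as the abstract describes, would presumably replace the fixed weight with one that depends on the data or on the training state.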
Findings: Experimental results show that YOLOXSP, built upon the YOLOv8x-POSE baseline, outperforms existing methods, achieving 73.2% keypoint mAP on the heavily occluded public benchmark OCHuman, 91.9% keypoint mAP on the dense real-world classroom dataset GUET-POSE, and 75.8% keypoint mAP on the GUET-Occluded-POSE dataset tailored for realistic occlusion scenarios. These results highlight the model's enhanced robustness in handling occluded keypoints and demonstrate its practical utility in smart classroom applications.
Research limitations/implications: This study focuses primarily on student posture estimation in smart classrooms. Future research could extend the model to broader human pose estimation applications in more diverse educational settings.
Practical implications: The proposed method contributes to more accurate student posture analysis, facilitating intelligent learning state assessment and engagement evaluation in smart classrooms.
Social implications: The study contributes to the broader adoption of AI-driven education by improving automated student posture analysis. Enhanced engagement detection can help educators develop personalized learning strategies, fostering a more inclusive and effective educational environment. The ability to assess learning behaviors in real time promotes adaptive teaching approaches, benefiting both in-person and remote learning. Furthermore, the dataset and model can be extended to other educational and behavioral research fields, supporting innovations in human-computer interaction, cognitive analysis, and assistive technologies. By improving occlusion-resistant pose estimation, this research enhances AI's role in educational accessibility and student well-being.
Originality/value: By designing ShareSepHead, NS-CBAM, and the Aol function, this study advances human pose estimation in occluded and dense keypoint scenarios. The design of the YOLOXSP framework improves pose detection accuracy in smart classrooms, making it well suited to AI-powered educational systems. This approach has the potential to further advance intelligent education and related fields.