Computer science
Sign language
Artificial intelligence
Natural language processing
Generative grammar
Semantics (computer science)
Matching (statistics)
Speech recognition
Pattern recognition (psychology)
Linguistics
Philosophy
Mathematics
Statistics
Programming language
Authors
Siyu Liang, Yunan Li, Wentian Xin, Huizhou Chen, Xingchen Liu, Kang Liu, Qiguang Miao
Source
Journal: Sensors
[Multidisciplinary Digital Publishing Institute]
Date: 2025-09-24
Volume/Issue: 25 (19): 5957
Abstract
While sign language combines sequential hand motions with concurrent non-manual cues (e.g., mouth shapes and head tilts), current recognition systems lack multimodal annotation methods capable of capturing their hierarchical semantics. To bridge this gap, we propose GSP-MC, the first method integrating generative large language models into sign language recognition. It leverages retrieval-augmented generation with domain-specific large language models and expert-validated corpora to produce precise multipart descriptions. A dual-encoder architecture bidirectionally aligns hierarchical skeleton features with multi-level text descriptions (global, synonym, part) through probabilistic matching. The approach combines global and part-level losses with KL divergence optimization, ensuring robust alignment across relevant text-skeleton pairs while capturing sign semantics and detailed dynamics. Experiments demonstrate state-of-the-art performance, achieving 97.1% accuracy on the Chinese SLR500 (surpassing SSRL’s 96.9%) and 97.07% on the Turkish AUTSL (exceeding SML’s 96.85%), confirming cross-lingual potential for inclusive communication technologies.
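As a rough illustration of the probabilistic text-skeleton alignment with KL divergence optimization described in the abstract, the sketch below shows one common way such a bidirectional alignment loss can be written in PyTorch. The function name, tensor shapes, and temperature value are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def bidirectional_kl_alignment(skel_emb, text_emb, temperature=0.07):
    """Hypothetical sketch: align skeleton and text embeddings of matched pairs.

    skel_emb: (N, D) skeleton features from one encoder
    text_emb: (N, D) text features (e.g., global/synonym/part descriptions)
    Returns a symmetric KL-divergence loss over the batch.
    """
    skel = F.normalize(skel_emb, dim=-1)
    text = F.normalize(text_emb, dim=-1)

    # Batch similarity matrix; matched pairs lie on the diagonal.
    logits = skel @ text.t() / temperature

    target = torch.eye(len(skel), device=logits.device)
    target = target / target.sum(dim=-1, keepdim=True)

    # KL divergence in both directions: skeleton->text and text->skeleton.
    loss_s2t = F.kl_div(F.log_softmax(logits, dim=-1), target, reduction="batchmean")
    loss_t2s = F.kl_div(F.log_softmax(logits.t(), dim=-1), target, reduction="batchmean")
    return (loss_s2t + loss_t2s) / 2

# Example: align a batch of 8 sign clips with their 256-dim text descriptions.
loss = bidirectional_kl_alignment(torch.randn(8, 256), torch.randn(8, 256))
```

In the paper's formulation this kind of loss is applied at both the global and part levels; the sketch shows only the generic single-level case.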