Computer science
Decoupling (probability)
Emotion recognition
Feature (linguistics)
Artificial intelligence
Affective computing
Speech recognition
Fusion
Pattern recognition (psychology)
Feature extraction
Sensor fusion
Engineering
Linguistics
Control engineering
Philosophy
Authors
Hongxiang Gao, Zhipeng Cai, Xingyao Wang, Min Wu, Chengyu Liu
Identifier
DOI: 10.1109/jbhi.2025.3597398
Abstract
Multimodal emotion recognition has emerged as a promising direction for capturing the complexity of human affective states by integrating physiological and behavioral signals. However, challenges remain in addressing feature redundancy, modality heterogeneity, and insufficient inter-modal supervision. In this paper, we propose a novel Multimodal Disentangled Knowledge Distillation framework that explicitly disentangles modality-shared and modality-specific features and enhances cross-modal knowledge transfer via a graph-based distillation module. Specifically, we introduce a dual-stream representation learning architecture that separates common and unique subspaces across modalities. To facilitate effective information interaction, we design a directed and learnable modality graph, where each edge represents the semantic transfer strength from one modality to another. We validate our method on two benchmark datasets, MAHNOB-HCI and DEAP, for both regression and classification tasks, under subject-dependent and subject-independent protocols. Experimental results demonstrate that our method achieves state-of-the-art performance, with statistical significance confirmed by paired two-tailed t-tests. In addition, qualitative analysis of the learned modality graph and t-SNE embeddings further illustrates the effectiveness of our feature disentanglement and dynamic knowledge transfer design. This work offers a unified, interpretable, and robust framework for multimodal emotion understanding and lays the foundation for affective computing in real-world human-machine interaction scenarios.
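To make the two core ideas in the abstract concrete, below is a minimal PyTorch sketch of (a) a dual-stream encoder that splits each modality into shared and specific subspaces, and (b) a directed, learnable modality graph whose softmax-normalized edge weights gate cross-modal distillation between shared features. This is an illustrative reconstruction from the abstract's description only, not the authors' implementation: all module names, dimensions, the MSE-based distillation loss, and the orthogonality regularizer are assumptions.

```python
# Illustrative sketch (NOT the authors' code): dual-stream disentanglement plus
# a learnable directed modality graph gating cross-modal knowledge distillation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualStreamEncoder(nn.Module):
    """Maps one modality's features into a shared and a specific subspace."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU(),
                                    nn.Linear(hid_dim, hid_dim))
        self.specific = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU(),
                                      nn.Linear(hid_dim, hid_dim))

    def forward(self, x):
        return self.shared(x), self.specific(x)

class ModalityGraphDistiller(nn.Module):
    """Directed M x M edge matrix; a softmax over source modalities yields the
    transfer strength from teacher modality j to student modality i
    (self-edges are masked out)."""
    def __init__(self, num_modalities):
        super().__init__()
        self.edge_logits = nn.Parameter(torch.zeros(num_modalities, num_modalities))

    def forward(self, shared_feats):  # list of M tensors, each [batch, dim]
        M = len(shared_feats)
        mask = torch.eye(M, device=self.edge_logits.device).bool()
        weights = self.edge_logits.masked_fill(mask, float('-inf')).softmax(dim=1)
        loss = 0.0
        for i in range(M):          # student modality
            for j in range(M):      # teacher modality
                if i == j:
                    continue
                # Edge-weighted distillation: pull the student's shared
                # features toward the (detached) teacher's shared features.
                loss = loss + weights[i, j] * F.mse_loss(
                    shared_feats[i], shared_feats[j].detach())
        return loss, weights

def orthogonality_loss(shared, specific):
    """Assumed disentanglement regularizer: penalizes overlap between the
    shared and specific subspaces of the same modality."""
    s = F.normalize(shared, dim=1)
    p = F.normalize(specific, dim=1)
    return (s * p).sum(dim=1).pow(2).mean()

if __name__ == "__main__":
    # Toy usage with three assumed modalities and made-up feature sizes.
    dims = {"eeg": 128, "peripheral": 32, "face": 256}
    encoders = nn.ModuleDict({m: DualStreamEncoder(d, 64) for m, d in dims.items()})
    distiller = ModalityGraphDistiller(num_modalities=len(dims))
    batch = {m: torch.randn(8, d) for m, d in dims.items()}
    shared, specific = zip(*(encoders[m](batch[m]) for m in dims))
    kd_loss, edges = distiller(list(shared))
    ortho = sum(orthogonality_loss(s, p) for s, p in zip(shared, specific))
    print(float(kd_loss), float(ortho), edges)
```

Because the edge logits are a trainable parameter, the learned matrix can be inspected after training, which matches the abstract's qualitative analysis of semantic transfer strengths between modalities; the exact loss terms and their weighting in the paper may differ from this sketch.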