Computer science
Modality
Artificial intelligence
Distillation
Emotion recognition
Pattern recognition (psychology)
Feature extraction
Machine learning
Human-computer interaction
Authors
Teng Sun,Yinwei Wei,Juntong Ni,Zixin Liu,Xuemeng Song,Yaowei Wang,Liqiang Nie
Identifier
DOI:10.1109/tmm.2024.3385180
Abstract
Due to its wide applications, multimodal emotion recognition has gained increasing research attention. Although existing methods have achieved compelling success with various multimodal fusion strategies, they overlook that the dominant modality (e.g., text) may create a shortcut and hence negatively affect the representation learning of the other modalities (e.g., image and audio). To alleviate this problem, we resort to knowledge distillation to narrow the gap between different modalities. In particular, we develop a new hierarchical knowledge distillation model for multimodal emotion recognition (HKD-MER), consisting of three components: feature extraction, hierarchical knowledge distillation, and attentive multimodal fusion. As the major contribution of our proposed model, the hierarchical knowledge distillation is designed to transfer knowledge from the dominant modality to the others at both the feature and label levels. It boosts the performance of the non-dominant modalities by modeling the inter-modal relations between different modalities. We have justified the effectiveness of our proposed model on two benchmark datasets.
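To make the described pipeline concrete, the following is a minimal PyTorch sketch of the two distillation levels mentioned in the abstract. It is not the authors' implementation: the encoder sizes, the choice of text as the dominant modality, the MSE feature-level loss, the temperature-scaled KL label-level loss, and names such as HKDMERSketch and hkd_loss are all illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityEncoder(nn.Module):
    # Projects raw per-modality features into a shared hidden space.
    def __init__(self, in_dim, hid_dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU(),
                                 nn.Linear(hid_dim, hid_dim))

    def forward(self, x):
        return self.net(x)

class HKDMERSketch(nn.Module):
    # Hypothetical sketch: text is assumed to be the dominant (teacher) modality.
    def __init__(self, dims, hid_dim=128, num_classes=6):
        super().__init__()
        self.encoders = nn.ModuleDict({m: ModalityEncoder(d, hid_dim) for m, d in dims.items()})
        self.classifiers = nn.ModuleDict({m: nn.Linear(hid_dim, num_classes) for m in dims})
        self.attn = nn.Linear(hid_dim, 1)                 # per-modality attention score
        self.fusion_head = nn.Linear(hid_dim, num_classes)

    def forward(self, inputs):
        feats = {m: self.encoders[m](x) for m, x in inputs.items()}
        logits = {m: self.classifiers[m](h) for m, h in feats.items()}
        # Attentive multimodal fusion: softmax over modality-wise attention scores.
        stacked = torch.stack(list(feats.values()), dim=1)   # (B, M, H)
        weights = F.softmax(self.attn(stacked), dim=1)        # (B, M, 1)
        fused = (weights * stacked).sum(dim=1)                # (B, H)
        return feats, logits, self.fusion_head(fused)

def hkd_loss(feats, logits, fused_logits, labels, dominant="text",
             tau=2.0, alpha=0.5, beta=0.5):
    # Task loss plus feature-level and label-level distillation from the dominant modality.
    task = F.cross_entropy(fused_logits, labels)
    teacher_feat = feats[dominant].detach()
    teacher_prob = F.softmax(logits[dominant].detach() / tau, dim=-1)
    feat_kd, label_kd = 0.0, 0.0
    for m in feats:
        if m == dominant:
            continue
        feat_kd += F.mse_loss(feats[m], teacher_feat)                       # feature level
        label_kd += F.kl_div(F.log_softmax(logits[m] / tau, dim=-1),
                             teacher_prob, reduction="batchmean") * tau**2  # label level
    return task + alpha * feat_kd + beta * label_kd

# Example usage with random features (batch of 4); dimensions are placeholders.
model = HKDMERSketch({"text": 768, "audio": 74, "visual": 35}, num_classes=6)
batch = {"text": torch.randn(4, 768), "audio": torch.randn(4, 74), "visual": torch.randn(4, 35)}
labels = torch.randint(0, 6, (4,))
feats, logits, fused_logits = model(batch)
loss = hkd_loss(feats, logits, fused_logits, labels)
loss.backward()

In this sketch, detaching the teacher's features and logits keeps the distillation gradients flowing only into the non-dominant branches, which is one way to realize the stated goal of boosting those modalities rather than reshaping the dominant one.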