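Feature (linguistics)
Computer science
Sentiment analysis
Pattern recognition (psychology)
Artificial intelligence
Philosophy
Linguistics
Authors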
Yao Fu,Biao Huang,Yujun Wen,Pengzhou Zhang
Identifier
DOI:10.1016/j.knosys.2024.111965
Abstract
Multimodal sentiment analysis (MSA) integrates textual, visual, and audio information from videos to identify human emotional states. This study proposes a multimodal feature decoupling strategy that categorizes sentiment features into common and private features. The private features aim to capture the uniqueness of each modality, thereby increasing feature diversity. In contrast, the common features seek to capture commonalities among different modalities, thus reducing potential information loss during decoupling. To achieve this, we designed dedicated encoders and loss function constraints for both types of features. Additionally, to mitigate information redundancy and prevent key information loss during decoupled representation learning, we introduce a dual feature reconstruction mechanism comprising unimodal feature reconstruction (UFR) and multimodal feature reconstruction (MFR). These mechanisms preserve vital information from the decoupling process and mitigate the impact of redundant data. Extensive experiments on three datasets demonstrate that our method improves accuracy by approximately 1%–3% over existing advanced techniques.
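The decoupling idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the encoders or the loss functions, so everything below is an assumption chosen for illustration. The sketch assumes linear encoders with random weights, a single common encoder shared by all modalities, a pairwise mean-squared-error term that pulls common features together, a squared-dot-product term that pushes common and private features of the same modality apart, and a unimodal reconstruction term that decodes each input from its common and private parts. All function names are hypothetical, and MFR is omitted because the abstract does not describe how it is computed.

```python
import random

def linear(weights, x):
    """Apply a linear map (a list of rows) to the vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def rand_matrix(rows, cols, rng):
    """Random weight matrix; stands in for a trained encoder or decoder."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def mse(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def decouple(features, common_enc, private_encs):
    """Map each modality to a common part (shared encoder, an assumption)
    and a private part (one encoder per modality)."""
    common = {m: linear(common_enc, x) for m, x in features.items()}
    private = {m: linear(private_encs[m], x) for m, x in features.items()}
    return common, private

def similarity_loss(common):
    """Assumed constraint: mean pairwise MSE between common features."""
    names = sorted(common)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    return sum(mse(common[a], common[b]) for a, b in pairs) / len(pairs)

def difference_loss(common, private):
    """Assumed constraint: squared dot product between the common and
    private features of the same modality (zero when orthogonal)."""
    return sum(dot(common[m], private[m]) ** 2 for m in common) / len(common)

def unimodal_reconstruction_loss(features, common, private, decoders):
    """Assumed UFR form: decode each input from [common; private]."""
    total = 0.0
    for m, x in features.items():
        recon = linear(decoders[m], common[m] + private[m])  # list concat
        total += mse(recon, x)
    return total / len(features)

# Toy setup: three modalities, 4-dim inputs, 3-dim latent features.
rng = random.Random(0)
modalities = ["text", "visual", "audio"]
features = {m: [rng.uniform(-1, 1) for _ in range(4)] for m in modalities}
common_enc = rand_matrix(3, 4, rng)
private_encs = {m: rand_matrix(3, 4, rng) for m in modalities}
decoders = {m: rand_matrix(4, 6, rng) for m in modalities}

common, private = decouple(features, common_enc, private_encs)
sim = similarity_loss(common)
diff = difference_loss(common, private)
ufr = unimodal_reconstruction_loss(features, common, private, decoders)
total_loss = sim + diff + ufr  # equal weighting is an assumption
```

In a trained model, the three terms would be minimized together with the sentiment prediction loss. Here the weights are random, so the values only show how the terms are computed.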