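Sentiment analysis
Representation (politics)
Computer science
Natural language processing
Artificial intelligence
Political science
Politics
Law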
Authors
Sicheng Zhao,Zhenhua Yang,Henglin Shi,Xiaocheng Feng,Lingpengkun Meng,Bing Qin,Chenggang Yan,Jianhua Tao,Guiguang Ding
Identifier
DOI:10.1109/taffc.2025.3539225
Abstract
Multimodal sentiment analysis (MSA) aims to leverage the complementary information from multiple modalities for affective understanding of user-generated videos. Existing methods have mainly focused on designing sophisticated feature fusion strategies to integrate separately extracted multimodal representations, while ignoring interference from information irrelevant to sentiment. In this paper, we propose to disentangle the unimodal representations into sentiment-specific and sentiment-independent features, and to fuse only the former for the MSA task. Specifically, we design a novel Sentiment-aware Disentangled Representation Shifting framework, termed SDRS, with two components. The first, interactive sentiment-aware representation disentanglement, extracts sentiment-specific feature representations for each nonverbal modality by considering the contextual influence of the other modalities through a newly developed cross-attention autoencoder. The second, attentive cross-modal representation shifting, shifts the textual representation in a latent token space using the projected nonverbal sentiment-specific representations. The shifted representation is finally used to fine-tune a pre-trained language model for multimodal sentiment analysis. Extensive experiments are conducted on three public benchmark datasets: CMU-MOSI, CMU-MOSEI, and CH-SIMS. The results demonstrate that the proposed SDRS framework not only obtains state-of-the-art results using multimodal labels alone but also outperforms methods that additionally require labels for each modality.
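The attentive cross-modal representation shifting step mentioned in the abstract could be sketched roughly as below. This is an illustrative assumption, not the authors' SDRS implementation: the function name `attentive_shift`, the projection matrix `proj`, the scaling factor `alpha`, and the residual-addition form of the shift are all hypothetical, since the abstract does not specify them. The sketch only shows the general idea of text tokens attending over projected nonverbal features and being shifted by the attended result.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attentive_shift(text, nonverbal, proj, alpha=0.5):
    """Shift text token representations using nonverbal features.

    text:      (n_text, d)  textual token representations
    nonverbal: (n_nv, d_nv) sentiment-specific nonverbal features (assumed given)
    proj:      (d_nv, d)    hypothetical projection into the text token space
    alpha:     hypothetical scalar controlling the size of the shift
    """
    projected = nonverbal @ proj                            # (n_nv, d)
    scores = text @ projected.T / np.sqrt(text.shape[1])    # (n_text, n_nv)
    weights = softmax(scores, axis=-1)                      # each row sums to 1
    shift = weights @ projected                             # (n_text, d)
    return text + alpha * shift, weights

# Toy inputs with made-up sizes: 4 text tokens, 6 audio frames.
rng = np.random.default_rng(0)
text = rng.normal(size=(4, 8))
audio = rng.normal(size=(6, 5))
proj = rng.normal(size=(5, 8))

shifted, weights = attentive_shift(text, audio, proj)
```

In this sketch the shifted tokens keep the shape of the original text representation, so they could in principle be passed to a language model in place of the unshifted tokens; how SDRS actually does this is not described in the abstract.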