TCTR: Text-Guided Contrastive Learning with Token-Level Reconstruction Network for missing modalities in multimodal sentiment analysis

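Keywords: Computer science; Security token; Modality; Artificial intelligence; Natural language processing; Sentiment analysis; Social science; Computer security; Sociology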

Authors

Zhihao Yang, Qing He, MingHao Yu, Nisuo Du, Yijie Lu

Source

Journal: Information Fusion [Elsevier BV]
Date: 2025-08-05 Volume/Issue: 126: 103571-103571 Citations: 8

Identifier

DOI: 10.1016/j.inffus.2025.103571

Abstract

Multimodal sentiment analysis (MSA) on incomplete multimodal data must account for randomly missing or noise-corrupted modality information while still producing robust sentiment predictions. This reflects the shift of MSA from idealized laboratory settings to real-world conditions and has made the problem an active research topic in multimodal learning. However, existing studies remain limited in how they analyze missingness and lack effective modeling of missing-modality scenarios. Moreover, current methods primarily focus on completing missing modality features in the feature space, overlooking information supplementation in the semantic space, which is crucial for MSA. To address this, we propose a text-guided fine-grained network model: Text-Guided Contrastive Learning with Token-Level Reconstruction Network (TCTR). This is motivated by the fact that the text modality typically contains more direct and complete sentiment information. In TCTR, we first design the Token-level Missing Inspection (TMI) module, which models missingness at the token level on the guided modality; this fine-grained analysis addresses the insufficient capture of critical sentiment information in missing inspection. Subsequently, in the Semantic Contrastive Learning for Missing Modality Supplementation (SCL-MMS) module, we leverage constructed negative sample labels to jointly complete missing sentiment information in both the feature space and the semantic space, mitigating the inadequate supplementation quality that results from relying solely on the feature space in existing methods. Finally, building on prior research, we perform interaction and fusion of multimodal features to enable sentiment polarity prediction.
Through performance comparisons with state-of-the-art methods and ablation studies on various datasets, the experimental results demonstrate that TCTR achieves superior sentiment polarity prediction across different modality-missing scenarios, effectively enhancing the robustness of MSA tasks in such conditions.

• MSA tasks under missing modality scenarios.
• Missing modality inspection and sentiment supplementation.
• Token-level missingness inspection and fine-grained missingness modeling.
• Missing modality supplementation guided by label semantics and feature space.
• The effectiveness of TCTR is validated through extensive experiments.
