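Sarcasm
Computer science
Mode
Artificial intelligence
Modality (human-computer interaction)
Sentiment analysis
Natural language processing
Machine learning
Linguistics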
Irony
Philosophy
Social science
Sociology
Authors
Hao Liu, Runguo Wei, Geng Tu, Jiali Lin, Cheng Liu, Dazhi Jiang
Identifier
DOI:10.1016/j.inffus.2024.102353
Abstract
Sarcasm is a form of sentiment expression that highlights the disparity between a person's true intentions and the content they explicitly present. With the exponential increase in multimodal data on social platforms, the detection of sarcasm across various modes has become a pivotal area of research. Although previous studies have extensively examined multimodal feature extraction, fusion, and the modeling of inter-modal incongruities, they often neglected the subtle sentiment cues inherent in sarcastic multimodal data. Additionally, they did not adequately address the sparse distribution and tenuous connections between sarcastic features both within and across modalities. To address these gaps, we introduce a hierarchical fusion model that integrates sentiment information for enhanced multimodal sarcasm detection. Specifically, we use attribute-object matching in the image modality, treating it as an auxiliary attribute modality. Sentiment data is then extracted from each modality and combined to achieve a more comprehensive representation within modalities. Moreover, we characterize the relationships of inter-modal incongruities using a cross-modal Transformer. We also implement a sentiment-aware image-text contrastive loss mechanism to better align the semantics of images and text. By intensifying these alignments, our model is better equipped to understand incongruous relationships. Experiments demonstrate that our hierarchical fusion model achieves state-of-the-art performance on the multimodal sarcasm detection task.
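The abstract names an image-text contrastive loss but does not give its formula. As a rough illustration only, the sketch below shows a generic symmetric image-text contrastive (InfoNCE-style) loss over a batch of matched pairs. It is not the authors' implementation: the sentiment-aware weighting is not specified in the abstract and is therefore left out, and the function names, the `temperature` default, and the toy embeddings are all assumptions made for this example.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax(row):
    """Numerically stable log-softmax over a list of scores."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def image_text_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Generic symmetric contrastive loss (hypothetical helper).

    image_embs[i] and text_embs[i] are assumed to be a matched pair;
    every other pairing in the batch is treated as a negative.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)

    # sim[i][j]: cosine similarity of image i and text j, divided by temperature.
    sim = [
        [sum(a * b for a, b in zip(imgs[i], txts[j])) / temperature
         for j in range(n)]
        for i in range(n)
    ]

    # Image-to-text direction: the correct text for image i is at index i.
    i2t = -sum(log_softmax(sim[i])[i] for i in range(n)) / n

    # Text-to-image direction: same computation on the transposed matrix.
    sim_t = [[sim[i][j] for i in range(n)] for j in range(n)]
    t2i = -sum(log_softmax(sim_t[j])[j] for j in range(n)) / n

    return 0.5 * (i2t + t2i)


# Toy check: matched pairs should score a lower loss than swapped pairs.
aligned = image_text_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                      [[1.0, 0.0], [0.0, 1.0]])
swapped = image_text_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                      [[0.0, 1.0], [1.0, 0.0]])
```

Minimizing a loss of this kind pulls matched image and text embeddings together and pushes mismatched ones apart, which is the sense in which such a loss aligns the semantics of the two modalities. A sentiment-aware variant would presumably modify how pairs are weighted or selected, but the abstract does not say how.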