Computer science
Modality
Multimodality
Artificial intelligence
Human–computer interaction
World Wide Web
Chemistry
Polymer chemistry
Authors
Yan Bai, Yanfeng Liu, Yongjun Li
Identifier
DOI:10.1109/tcss.2024.3415160
Abstract
Fake news detection (FND) has recently become an essential task in social network analysis, and multimodal detection methods that combine text and images have been extensively explored over the last five years. However, the physical characteristics of images that are clearly revealed at the frequency level are often ignored, so cross-modal feature extraction and interaction remain a major challenge when the frequency domain is introduced into multimodal FND. To address this issue, we propose a frequency-aware cross-modal interaction network (FCINet) for multimodal FND. First, a triple-branch encoder with strong feature-extraction capacity is proposed to learn representations of the frequency, spatial, and text domains separately. Then, we design a parallel cross-modal interaction strategy that fully exploits the interdependencies among the three branches to facilitate multimodal FND. Finally, a combined loss function, including deep auxiliary supervision and event classification, is introduced to improve generalization in multitask training. Extensive experiments and visual analyses on two public real-world multimodal fake news datasets show that FCINet achieves excellent performance and surpasses numerous state-of-the-art methods.
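To make the abstract's two key ideas concrete, here is a minimal, hypothetical sketch of (a) a frequency-branch input built from an image's 2D FFT magnitude spectrum and (b) a toy cross-modal interaction that lets a text vector attend over the frequency features. This is not the paper's actual FCINet architecture; the function names, feature shapes, and the use of simple dot-product attention are illustrative assumptions only.

```python
import numpy as np

def frequency_features(image, keep=8):
    # Illustrative frequency-branch input: shift the 2D FFT so low
    # frequencies sit at the center, take the log-magnitude spectrum,
    # and crop the central `keep` x `keep` low-frequency block.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    log_mag = np.log1p(np.abs(spectrum))
    h, w = log_mag.shape
    cy, cx = h // 2, w // 2
    half = keep // 2
    return log_mag[cy - half:cy + half, cx - half:cx + half]

def cross_modal_interaction(freq_feat, text_feat):
    # Toy interaction (an assumption, not the paper's strategy):
    # scaled dot-product attention of the text vector (query) over
    # rows of the flattened frequency features (keys/values).
    q = text_feat                                       # (d,)
    k = freq_feat.reshape(-1, text_feat.shape[0])       # (n, d)
    scores = k @ q / np.sqrt(q.shape[0])                # (n,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                            # softmax
    return weights @ k                                  # fused (d,) vector

rng = np.random.default_rng(0)
img = rng.standard_normal((32, 32))   # stand-in grayscale image
txt = rng.standard_normal(8)          # stand-in text embedding
f = frequency_features(img, keep=8)   # (8, 8) frequency block
fused = cross_modal_interaction(f, txt)
print(f.shape, fused.shape)
```

In a full model each branch would be a learned encoder and the fused vector would feed a classifier; here the FFT step only illustrates why frequency-level cues (periodic resampling or compression artifacts in manipulated images) are visible in a representation that raw pixels hide.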