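Computer science
Information extraction
Fuse (electrical)
Task (project management)
Artificial intelligence
Ambiguity
Encoding (set theory)
Information retrieval
Programming language
Electrical engineering
Engineering
Economics
Set (abstract data type)
Management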
Authors
Bo Xu,Shizhou Huang,Ming Du,Hongya Wang,Hui Song,Yanghua Xiao,Xin Lin
Identifier
DOI:10.1007/978-3-031-30675-4_40
Abstract
Multimodal information extraction has recently gained increasing attention in social media understanding. It uses images as auxiliary information to resolve the ambiguity caused by the insufficient semantic information in short texts. Despite their success, current methods do not take full advantage of the information provided by the diverse representations of images. To address this problem, we propose a novel unified visual prompt tuning framework with Mixture-of-Experts to fuse different types of image representations for multimodal information extraction. Extensive experiments conducted on two different multimodal information extraction tasks demonstrate the effectiveness of our method. The source code can be found at https://github.com/xubodhu/VisualPT-MoE.