Computer science
Information bottleneck method
Bottleneck
Artificial intelligence
Relation (database)
Feature extraction
Graph
Benchmark (surveying)
Task (project management)
Pattern recognition (psychology)
Image (mathematics)
Machine learning
Relation extraction
Information extraction
Data mining
Cluster analysis
Theoretical computer science
Management
Economics
Geography
Embedded system
Geodesy
Authors
Shengqiong Wu, Hao Fei, Yixin Cao, Lidong Bing, Tat-Seng Chua
Identifier
DOI:10.18653/v1/2023.acl-long.823
Abstract
Existing research on multimodal relation extraction (MRE) faces two co-existing challenges, internal-information over-utilization and external-information under-exploitation. To combat that, we propose a novel framework that simultaneously implements the idea of internal-information screening and external-information exploiting. First, we represent the fine-grained semantic structures of the input image and text with the visual and textual scene graphs, which are further fused into a unified cross-modal graph (CMG). Based on CMG, we perform structure refinement with the guidance of the graph information bottleneck principle, actively denoising the less-informative features. Next, we perform topic modeling over the input image and text, incorporating latent multimodal topic features to enrich the contexts. On the benchmark MRE dataset, our system outperforms the current best model significantly. With further in-depth analyses, we reveal the great potential of our method for the MRE task.
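For context, the information bottleneck principle invoked in the abstract is conventionally stated as the following trade-off; this is the generic formulation with the standard symbols (Z the compressed representation, X the input, Y the prediction target, β the trade-off weight), not necessarily the exact objective optimized in the paper:

\max_{p(z \mid x)} \; I(Z; Y) \;-\; \beta \, I(Z; X)

Read against the abstract, the fused cross-modal graph presumably plays the role of X and its refined representation the role of Z, so that penalizing I(Z; X) drives the removal of less-informative nodes and edges while keeping I(Z; Y) high preserves the features needed for relation prediction.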