Computer science
Image (mathematics)
Artificial intelligence
Natural language processing
Pattern recognition (psychology)
Authors
Xiaoyu Zhao,Buzhou Tang
Identifier
DOI:10.1007/978-3-030-92185-9_39
Abstract
Multimodal named entity recognition (MNER), which recognizes named entities in text with the help of images, has become a popular topic in recent years. Previous studies on MNER use only visual features or detected concepts from a given image directly, without considering implicit knowledge among visual concepts. Taking into consideration concepts that are not detected but are relevant to those in the image provides rich prior knowledge, which has proven effective on other multimodal tasks. This paper proposes a novel method that takes full advantage of external implicit knowledge, called the Co-attention-based model with Dynamic Visual Concept Expansion (CDVCE). In CDVCE, we adopt the concept co-occurrence matrix of a large-scale annotated image database as implicit knowledge among visual concepts, and dynamically expand the detected visual concepts conditioned on the concept co-occurrence matrix and the input text. Experiments conducted on two public MNER datasets demonstrate the effectiveness of our proposed method, which outperforms other state-of-the-art methods in most cases.
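The expansion idea in the abstract can be illustrated with a small toy sketch. This is not the authors' CDVCE model, which is co-attention-based and learned; it only shows how a concept co-occurrence table built from annotated images might be used to add undetected but related concepts, weighted by the input text. All names here (`build_cooccurrence`, `cooccurrence`, `expand_concepts`, `text_relevance`) are hypothetical, and the hand-set `text_relevance` scores are an assumption standing in for whatever text-conditioned weighting the paper actually learns.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(annotations):
    """Count how often each unordered pair of concepts shares an image."""
    counts = Counter()
    for concepts in annotations:
        for a, b in combinations(sorted(set(concepts)), 2):
            counts[(a, b)] += 1
    return counts


def cooccurrence(counts, a, b):
    """Symmetric lookup: pairs are stored with the smaller string first."""
    return counts[(a, b) if a <= b else (b, a)]


def expand_concepts(detected, counts, text_relevance, top_k=2):
    """Rank undetected concepts by co-occurrence with the detected ones,
    scaled by a text-dependent relevance score.

    text_relevance is a hypothetical, hand-set stand-in for a learned
    text-conditioned weight; it is an assumption of this sketch.
    """
    vocabulary = {c for pair in counts for c in pair}
    scores = {}
    for candidate in vocabulary - set(detected):
        prior = sum(cooccurrence(counts, candidate, d) for d in detected)
        score = prior * text_relevance.get(candidate, 0.0)
        if score > 0:
            scores[candidate] = score
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_k]


# Toy "annotated image database": one concept list per image.
annotations = [
    ["person", "ball", "field"],
    ["person", "ball", "goal"],
    ["person", "dog"],
    ["dog", "ball"],
]
counts = build_cooccurrence(annotations)

# Assumed relevance of each candidate concept to the input sentence.
text_relevance = {"goal": 0.9, "field": 0.8, "dog": 0.1}

expanded = expand_concepts(["person", "ball"], counts, text_relevance)
print(expanded)
```

In this toy setup all three candidates co-occur equally often with the detected concepts, so the text-dependent weight decides the ranking, which is the sense in which the expansion is "dynamic" rather than fixed per image.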