A Unified MRC Framework for Named Entity Recognition

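Named-entity recognition, Computer science, Task (project management), Question answering, Entity linking, Artificial intelligence, Natural language processing, Security token, Sequence (biology), Process (computing), Sequence labeling, Programming language, Knowledge base, Economics, Management, Biology, Genetics, Computer security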
Authors
Xiaoya Li,Jingrong Feng,Yuxian Meng,Qinghong Han,Fei Wu,Jiwei Li
Identifier
DOI:10.18653/v1/2020.acl-main.519
Abstract

Named entity recognition (NER) is normally divided into nested NER and flat NER, depending on whether named entities are nested or not. Models are usually developed separately for the two tasks, since sequence labeling models, the most widely used backbone for flat NER, can assign only a single label to a given token, which is unsuitable for nested NER, where a token may be assigned several labels. In this paper, we propose a unified framework that is capable of handling both flat and nested NER tasks. Instead of treating NER as a sequence labeling problem, we propose to formulate it as a machine reading comprehension (MRC) task. For example, extracting entities with the PER label is formalized as extracting answer spans to the question “which person is mentioned in the text”. This formulation naturally tackles the entity overlapping issue in nested NER: the extraction of two overlapping entities with different categories requires answering two independent questions. Additionally, since the query encodes informative prior knowledge, this strategy facilitates the process of entity extraction, leading to better performance not only for nested NER but also for flat NER. We conduct experiments on both nested and flat NER datasets. Experimental results demonstrate the effectiveness of the proposed formulation. We achieve large performance gains over current SOTA models on nested NER datasets, i.e., +1.28, +2.55, +5.44, and +6.37 on ACE04, ACE05, GENIA, and KBP17, respectively, along with SOTA results on flat NER datasets, i.e., +0.24, +1.95, +0.21, and +1.49 on English CoNLL 2003, English OntoNotes 5.0, Chinese MSRA, and Chinese OntoNotes 4.0, respectively.
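
The query-per-label idea in the abstract can be illustrated with a small data-preparation sketch. This is not the authors' implementation: it only shows how one annotated sentence might be turned into one question-answering example per entity type, so that overlapping entities of different categories end up as answers to separate questions. The `Entity` class, the `to_mrc_examples` helper, the example sentence, and the ORG question wording are all hypothetical; only the PER question is taken from the abstract.

```python
from dataclasses import dataclass

# Label-to-question mapping. Only the PER question comes from the abstract;
# the ORG wording is an assumption made for illustration.
QUERIES = {
    "PER": "which person is mentioned in the text",
    "ORG": "which organization is mentioned in the text",
}


@dataclass(frozen=True)
class Entity:
    label: str
    start: int  # inclusive token index
    end: int    # inclusive token index


def to_mrc_examples(tokens, entities, queries=QUERIES):
    """Build one (question, context, answer spans) example per entity label."""
    examples = []
    for label, question in queries.items():
        # Each question only sees the spans of its own label, so spans that
        # overlap across labels never compete for a single per-token tag.
        spans = sorted((e.start, e.end) for e in entities if e.label == label)
        examples.append(
            {"label": label, "question": question, "context": tokens, "answers": spans}
        )
    return examples


# Hypothetical nested annotation: the ORG span lies inside the PER span.
tokens = "the president of Acme Corp spoke".split()
entities = [Entity("PER", 0, 4), Entity("ORG", 3, 4)]
examples = to_mrc_examples(tokens, entities)

for ex in examples:
    print(ex["label"], ex["question"], ex["answers"])
```

In this sketch the tokens "Acme" and "Corp" belong to both the PER and the ORG span, which a single tag per token cannot express, whereas the two questions each carry their own answer span. A label with no entities in the sentence would simply yield an empty answer list.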
