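Computer science
Medical terminology
Terminology
Annotation
Artificial intelligence
Natural language processing
Matching (statistics)
Named entity recognition
Word (group theory)
Information retrieval
Linguistics
Economics
Task (project management)
Management
Philosophy
Statistics
Mathematics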
Authors
Zhichao Zhu, Jianqiang Li, Zhao Quan, Faheem Akhtar
Identifier
DOI: 10.1016/j.eswa.2023.120709
Abstract
Biomedical named entity recognition (BNER) is a critical task in biomedical information extraction. Most popular deep-learning approaches to BNER use words and characters as features to represent medical texts. However, many medical terms span multiple words or characters. Splitting a term into its words (or characters) and weighting each one with a standard attention mechanism can disperse the attention score, so the term as a whole receives a lower weight. This paper proposes a Dictionary-guided Attention Network (DGAN) for BNER in Chinese electronic medical records (EMRs). First, the EMR text is matched against a biomedical dictionary, and the matched medical concepts are extracted as large-size words that supply the full semantic information of each medical term. Then, based on the dictionary matches, an optimized attention strategy focuses on each medical concept and adaptively assigns higher weights to the characters it contains. Furthermore, semi-supervised learning is introduced to reduce manual data labeling and to handle entities not defined in the medical dictionary. To validate the model, we conduct comprehensive experiments on a real-world Chinese EMR dataset and the CCKS2017 dataset. The results show that our method not only achieves state-of-the-art BNER performance but also reduces manual data annotation.
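The abstract does not specify the matching algorithm or the exact form of the attention adjustment, so the following is only a minimal sketch of the general idea, not the DGAN implementation. It assumes forward maximum matching against a set of dictionary terms and an additive bias on the pre-softmax scores of matched characters; the function names, the `max_len` limit, and the `bias` value are all hypothetical.

```python
import math


def match_concepts(text, dictionary, max_len=8):
    """Return (start, end) spans of multi-character dictionary terms.

    Assumption: forward maximum matching; the paper may match differently.
    """
    spans = []
    i = 0
    while i < len(text):
        matched = False
        # Try the longest candidate first, down to two characters.
        for length in range(min(max_len, len(text) - i), 1, -1):
            if text[i:i + length] in dictionary:
                spans.append((i, i + length))
                i += length
                matched = True
                break
        if not matched:
            i += 1
    return spans


def guided_attention(scores, spans, bias=1.0):
    """Softmax over per-character scores, with matched characters boosted.

    Assumption: a fixed additive bias stands in for the paper's
    adaptive weighting, which the abstract does not define.
    """
    if not scores:
        return []
    boosted = list(scores)
    for start, end in spans:
        for k in range(start, end):
            boosted[k] += bias
    peak = max(boosted)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in boosted]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical one-entry dictionary and EMR sentence for illustration.
text = "患者有糖尿病史"
spans = match_concepts(text, {"糖尿病"})
weights = guided_attention([0.0] * len(text), spans)
```

With uniform raw scores, the three characters of the matched concept receive more attention mass than the unmatched ones, which is the effect the abstract describes as avoiding dispersed attention over a medical term.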