Information extraction from weakly structured radiological reports with natural language queries

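Radiological weapon, Medicine, Neuroradiology reading room, Interventional radiology, Information extraction, Information retrieval, Radiology, Computer science, Neurology, Psychiatry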
Authors
Amin Dada,Tim Leon Ufer,Moon Kim,Max Hasin,Nicola Spieker,Michael Forsting,Felix Nensa,Jan Egger,Jens Kleesiek
Source
Journal: European Radiology [Springer Science+Business Media]
Volume (issue): 34 (1): 330-337 Citations: 6
Identifier
DOI:10.1007/s00330-023-09977-3
Abstract

Objectives: Provide physicians and researchers an efficient way to extract information from weakly structured radiology reports with natural language processing (NLP) machine learning models.

Methods: We evaluate seven different German bidirectional encoder representations from transformers (BERT) models on a dataset of 857,783 unlabeled radiology reports and an annotated reading comprehension dataset in the format of SQuAD 2.0 based on 1223 additional reports.

Results: Continued pre-training of a BERT model on the radiology dataset and a medical online encyclopedia resulted in the most accurate model with an F1-score of 83.97% and an exact match score of 71.63% for answerable questions and 96.01% accuracy in detecting unanswerable questions. Fine-tuning a non-medical model without further pre-training led to the lowest-performing model. The final model proved stable against variation in the formulations of questions and in dealing with questions on topics excluded from the training set.

Conclusions: General domain BERT models further pre-trained on radiological data achieve high accuracy in answering questions on radiology reports. We propose to integrate our approach into the workflow of medical practitioners and researchers to extract information from radiology reports.

Clinical relevance statement: By reducing the need for manual searches of radiology reports, radiologists’ resources are freed up, which indirectly benefits patients.

Key Points
• BERT models pre-trained on general domain datasets and radiology reports achieve high accuracy (83.97% F1-score) on question-answering for radiology reports.
• The best performing model achieves an F1-score of 83.97% for answerable questions and 96.01% accuracy for questions without an answer.
• Additional radiology-specific pretraining of all investigated BERT models improves their performance.
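The abstract reports an exact match score and an F1-score for answerable questions, plus accuracy on unanswerable ones. As a rough illustration of how SQuAD 2.0-style scores are commonly computed, here is a minimal sketch. It is not the authors' evaluation code: the function names and the German example strings are hypothetical, and the normalization is simplified (the official SQuAD script also strips English articles, which is skipped here).

```python
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop ASCII punctuation, and split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return text.split()


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized answers are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def token_f1(prediction: str, truth: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred, gold = normalize(prediction), normalize(truth)
    if not pred or not gold:
        # Assumed SQuAD 2.0 convention: an empty string stands for
        # "unanswerable", so both sides must be empty to score 1.0.
        return float(pred == gold)
    overlap = sum((Counter(pred) & Counter(gold)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical predicted span vs. annotated span from a report.
print(exact_match("kein Pneumothorax", "kein Pneumothorax rechts"))
print(token_f1("kein Pneumothorax", "kein Pneumothorax rechts"))
```

In this sketch a predicted span that covers two of three annotated tokens earns partial F1 credit but no exact match credit, which is consistent with the reported F1-score being higher than the exact match score.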
