Natural language processing for automatic evaluation of free-text answers — a feasibility study based on the European Diploma in Radiology examination

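Text messaging, Artificial intelligence, Natural language processing, Computer science, Python (programming language), Multiple choice, Information retrieval, Medicine, Radiology, World Wide Web, Statistical significance, Internal medicine, Operating system
Authors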
Fabian Stoehr,Benedikt Kämpgen,Lukáš Müller,Laura Oleaga,Vanesa Junquero,Cristina Merino,Peter Mildenberger,Roman Kloeckner
Source
Journal: Insights Into Imaging [Springer Nature]
Volume (issue): 14 (1) Cited by: 2
Identifier
DOI:10.1186/s13244-023-01507-5
Abstract

Written medical examinations consist of multiple-choice questions and/or free-text answers. The latter require manual evaluation and rating, which is time-consuming and potentially error-prone. We tested whether natural language processing (NLP) can be used to automatically analyze free-text answers to support the review process.

The European Board of Radiology of the European Society of Radiology provided representative datasets comprising sample questions, answer keys, participant answers, and reviewer markings from European Diploma in Radiology examinations. Three free-text questions with the highest number of corresponding answers were selected: Questions 1 and 2 were "unstructured" and required a typical free-text answer whereas question 3 was "structured" and offered a selection of predefined wordings/phrases for participants to use in their free-text answer. The NLP engine was designed using word lists, rule-based synonyms, and decision tree learning based on the answer keys and its performance tested against the gold standard of reviewer markings.

After implementing the NLP approach in Python, F1 scores were calculated as a measure of NLP performance: 0.26 (unstructured question 1, n = 96), 0.33 (unstructured question 2, n = 327), and 0.5 (more structured question, n = 111). The respective precision/recall values were 0.26/0.27, 0.4/0.32, and 0.62/0.55.

This study showed the successful design of an NLP-based approach for automatic evaluation of free-text answers in the EDiR examination. Thus, as a future field of application, NLP could work as a decision-support system for reviewers and support the design of examinations being adjusted to the requirements of an automated, NLP-based review process.

Natural language processing can be successfully used to automatically evaluate free-text answers, performing better with more structured question-answer formats. Furthermore, this study provides a baseline for further work applying, e.g., more elaborated NLP approaches/large language models.

• Free-text answers require manual evaluation, which is time-consuming and potentially error-prone.
• We developed a simple NLP-based approach - requiring only minimal effort/modeling - to automatically analyze and mark free-text answers.
• Our NLP engine has the potential to support the manual evaluation process.
• NLP performance is better on a more structured question-answer format.
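
To make the general idea concrete, the following is a minimal Python sketch of word-list and synonym matching scored against reviewer markings. It is not the authors' engine: the answer key, the sample answers, the reviewer markings, and the names `ANSWER_KEY`, `concepts_found`, `mark`, and `precision_recall_f1` are all hypothetical, the decision tree learning step is left out, and the abstract does not say how scores were aggregated, so the plain binary precision/recall/F1 used here is an assumption.

```python
import re

# Hypothetical answer key: each expected concept maps to accepted wordings
# (the word list plus rule-based synonyms). Not taken from the EDiR data.
ANSWER_KEY = {
    "pneumothorax": {"pneumothorax", "collapsed lung"},
    "rib fracture": {"rib fracture", "broken rib", "fractured rib"},
}


def normalize(text):
    """Lowercase and drop punctuation so phrases can be matched on word boundaries."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def concepts_found(answer, key=ANSWER_KEY):
    """Return the key concepts for which any accepted wording occurs in the answer."""
    padded = f" {normalize(answer)} "
    return {
        concept
        for concept, phrases in key.items()
        if any(f" {normalize(phrase)} " in padded for phrase in phrases)
    }


def mark(answer, key=ANSWER_KEY):
    """Mark an answer as correct only if every key concept is mentioned."""
    return concepts_found(answer, key) == set(key)


def precision_recall_f1(predicted, gold):
    """Binary precision, recall, and F1 of predicted marks against gold marks."""
    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if p and g)
    fp = sum(1 for p, g in pairs if p and not g)
    fn = sum(1 for p, g in pairs if g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical participant answers and reviewer (gold standard) markings.
answers = [
    "Left pneumothorax with a fractured rib.",
    "Collapsed lung; broken rib on the left.",
    "Pneumothorax only.",
    "Rib fracture and pleural effusion.",
]
reviewer_marks = [True, False, True, False]

predicted_marks = [mark(a) for a in answers]
precision, recall, f1 = precision_recall_f1(predicted_marks, reviewer_marks)
print(predicted_marks, precision, recall, f1)
```

A matcher of this kind ignores negation and context ("no pneumothorax" still matches), which is one plausible reason why predefined wordings, as in the structured question, would be easier to mark automatically.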
