Can ChatGPT-4 evaluate whether a differential diagnosis list contains the correct diagnosis as accurately as a physician?

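Medical diagnosis, Cohen's kappa, Differential diagnosis, Kappa, Medicine, Medical physics, Pediatrics, Radiology, Computer science, Machine learning, Pathology, Mathematics, Geometry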
Authors
Kazuya Mizuta, Takanobu Hirosawa, Yukinori Harada, Taro Shimizu
Source
Journal: Diagnosis [De Gruyter]
Citations: 3
Identifier
DOI: 10.1515/dx-2024-0027
Abstract

Objectives: The potential of artificial intelligence (AI) chatbots, particularly the fourth-generation chat generative pretrained transformer (ChatGPT-4), in assisting with medical diagnosis is an emerging research area. While there has been significant emphasis on creating lists of differential diagnoses, it is not yet clear how well AI chatbots can evaluate whether the final diagnosis is included in these lists. This short communication aimed to assess the accuracy of ChatGPT-4 in evaluating lists of differential diagnosis compared to medical professionals’ assessments.

Methods: We used ChatGPT-4 to evaluate whether the final diagnosis was included in the top 10 differential diagnosis lists created by physicians, ChatGPT-3, and ChatGPT-4, using clinical vignettes. Eighty-two clinical vignettes were used, comprising 52 complex case reports published by the authors from the department and 30 mock cases of common diseases created by physicians from the same department. We compared the agreement between ChatGPT-4 and the physicians on whether the final diagnosis was included in the top 10 differential diagnosis lists using the kappa coefficient.

Results: Three sets of differential diagnoses were evaluated for each of the 82 cases, resulting in a total of 246 lists. The agreement rate between ChatGPT-4 and physicians was 236 out of 246 (95.9%), with a kappa coefficient of 0.86, indicating very good agreement.

Conclusions: ChatGPT-4 demonstrated very good agreement with physicians in evaluating whether the final diagnosis should be included in the differential diagnosis lists.
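For readers unfamiliar with the kappa coefficient, the following is a minimal sketch of how Cohen's kappa is computed for two raters making a yes/no judgment (here, "final diagnosis included in the list" or not). The function name `cohen_kappa_2x2` and the cell counts in the usage example are hypothetical and chosen only for illustration; the abstract reports the overall agreement (236 of 246) and kappa (0.86) but not the underlying 2x2 table, so the study's own kappa cannot be reproduced from the abstract alone.

```python
def cohen_kappa_2x2(both_yes, a_yes_b_no, a_no_b_yes, both_no):
    """Cohen's kappa for two raters giving yes/no judgments on the same items.

    Arguments are the four cell counts of the 2x2 agreement table.
    """
    n = both_yes + a_yes_b_no + a_no_b_yes + both_no
    if n == 0:
        raise ValueError("agreement table is empty")

    # Observed agreement: fraction of items on which the raters agree.
    p_o = (both_yes + both_no) / n

    # Chance agreement: product of each rater's marginal "yes" rates
    # plus product of their marginal "no" rates.
    a_yes = both_yes + a_yes_b_no
    b_yes = both_yes + a_no_b_yes
    p_e = (a_yes * b_yes + (n - a_yes) * (n - b_yes)) / (n * n)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical counts for illustration only (NOT the study's data).
example_kappa = cohen_kappa_2x2(both_yes=40, a_yes_b_no=10, a_no_b_yes=5, both_no=45)

# Observed agreement as reported in the abstract: 236 of 246 lists.
reported_agreement_pct = round(236 / 246 * 100, 1)
```

Kappa discounts the agreement expected by chance, which is why a raw agreement of 95.9% corresponds to a kappa below that figure: when most lists are judged the same way by both raters, a large share of matches would occur even without any real concordance.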