ChatGPT-Generated Differential Diagnosis Lists for Complex Case–Derived Clinical Vignettes: Diagnostic Accuracy Evaluation

Keywords: Differential diagnosis; Medical diagnosis; Medicine; Radiology; Pathology
Authors
Takanobu Hirosawa,Ren Kawamura,Yukinori Harada,Kazuya Mizuta,Kazuki Tokumasu,Yuki Kaji,Tomoharu Suzuki,Taro Shimizu
Source
Journal: JMIR Medical Informatics [JMIR Publications Inc.]
Volume/issue: 11: e48808-e48808  Cited by: 103
Identifier
DOI:10.2196/48808
Abstract

Background: The diagnostic accuracy of differential diagnoses generated by artificial intelligence chatbots, including ChatGPT models, for complex clinical vignettes derived from general internal medicine (GIM) department case reports is unknown.

Objective: This study aims to evaluate the accuracy of the differential diagnosis lists generated by both third-generation ChatGPT (ChatGPT-3.5) and fourth-generation ChatGPT (ChatGPT-4) by using case vignettes from case reports published by the Department of GIM of Dokkyo Medical University Hospital, Japan.

Methods: We searched PubMed for case reports. From the identified reports, physicians selected diagnostic cases, determined the final diagnosis, and converted the cases into clinical vignettes. Physicians entered a fixed prompt text together with each clinical vignette into ChatGPT-3.5 and ChatGPT-4 to generate the top 10 differential diagnoses. The ChatGPT models were not specially trained or further reinforced for this task. Three GIM physicians from other medical institutions created differential diagnosis lists by reading the same clinical vignettes. We measured the rate of correct diagnosis within the top 10 differential diagnosis lists, within the top 5 differential diagnosis lists, and as the top diagnosis.

Results: In total, 52 case reports were analyzed. The rates of correct diagnosis by ChatGPT-4 within the top 10 differential diagnosis lists, within the top 5 differential diagnosis lists, and as the top diagnosis were 83% (43/52), 81% (42/52), and 60% (31/52), respectively. The corresponding rates for ChatGPT-3.5 were 73% (38/52), 65% (34/52), and 42% (22/52), respectively. The rates of correct diagnosis by ChatGPT-4 were comparable to those by physicians within the top 10 differential diagnosis lists (43/52, 83% vs 39/52, 75%, respectively; P=.47), within the top 5 differential diagnosis lists (42/52, 81% vs 35/52, 67%, respectively; P=.18), and as the top diagnosis (31/52, 60% vs 26/52, 50%, respectively; P=.43); none of these differences was statistically significant. The ChatGPT models' diagnostic accuracy did not vary significantly with open access status or publication date (before 2011 vs 2022).

Conclusions: This study demonstrates the potential diagnostic accuracy of differential diagnosis lists generated using ChatGPT-3.5 and ChatGPT-4 for complex clinical vignettes from case reports published by the GIM department. The rate of correct diagnoses within the top 10 and top 5 differential diagnosis lists generated by ChatGPT-4 exceeds 80%. Although derived from a limited data set of case reports from a single department, our findings highlight the potential utility of ChatGPT-4 as a supplementary tool for physicians, particularly for those affiliated with the GIM department. Further investigations should explore the diagnostic accuracy of ChatGPT by using distinct case materials beyond its training data. Such efforts will provide comprehensive insight into the role of artificial intelligence in enhancing clinical decision-making.
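The reported rates and P values can be checked from the counts given in the Results. The sketch below is illustrative only: the abstract does not name the statistical test, so a chi-square test with Yates continuity correction on a 2x2 table is an assumption here, and the function names `rate` and `yates_chi2_p` are hypothetical helpers rather than anything from the study. Under that assumption, the computed values round to the P values quoted above.

```python
import math

N_CASES = 52  # number of case reports analyzed, from the Results

# Correct-diagnosis counts (ChatGPT-4, physicians), taken from the abstract.
COUNTS = {"top 10": (43, 39), "top 5": (42, 35), "top 1": (31, 26)}


def rate(correct: int, total: int = N_CASES) -> float:
    """Proportion of vignettes with the correct diagnosis."""
    return correct / total


def yates_chi2_p(hits_a: int, hits_b: int, n: int = N_CASES) -> float:
    """Two-sided P value for two groups of size n (assumed test, see lead-in).

    Uses the Yates-corrected chi-square statistic for a 2x2 table. With
    1 degree of freedom, the chi-square survival function is erfc(sqrt(x/2)).
    """
    miss_a, miss_b = n - hits_a, n - hits_b
    total = 2 * n
    diff = abs(hits_a * miss_b - miss_a * hits_b)
    numerator = total * max(diff - total / 2, 0.0) ** 2
    denominator = n * n * (hits_a + hits_b) * (miss_a + miss_b)
    chi2 = numerator / denominator
    return math.erfc(math.sqrt(chi2 / 2))


if __name__ == "__main__":
    for label, (gpt4, physicians) in COUNTS.items():
        print(
            f"{label}: ChatGPT-4 {rate(gpt4):.0%}, "
            f"physicians {rate(physicians):.0%}, "
            f"P={yates_chi2_p(gpt4, physicians):.2f}"
        )
```

Without the continuity correction the P values come out smaller, so the choice of test matters when comparing against the published figures.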