ChatGPT-Generated Differential Diagnosis Lists for Complex Case–Derived Clinical Vignettes: Diagnostic Accuracy Evaluation

Keywords: differential diagnosis, medical diagnosis, medicine, radiology, pathology
Authors
Takanobu Hirosawa, Ren Kawamura, Yukinori Harada, Kazuya Mizuta, Kazuki Tokumasu, Yuki Kaji, Tomoharu Suzuki, Taro Shimizu
Source
Journal: JMIR Medical Informatics [JMIR Publications Inc.]
Volume: 11, e48808 Cited by: 103
Identifier
DOI: 10.2196/48808
Abstract

Background: The diagnostic accuracy of differential diagnoses generated by artificial intelligence chatbots, including ChatGPT models, for complex clinical vignettes derived from general internal medicine (GIM) department case reports is unknown.

Objective: This study aims to evaluate the accuracy of the differential diagnosis lists generated by both third-generation ChatGPT (ChatGPT-3.5) and fourth-generation ChatGPT (ChatGPT-4) by using case vignettes from case reports published by the Department of GIM of Dokkyo Medical University Hospital, Japan.

Methods: We searched PubMed for case reports. From the reports identified, physicians selected diagnostic cases, determined the final diagnosis, and converted each case into a clinical vignette. Physicians entered a predetermined prompt text together with each clinical vignette into ChatGPT-3.5 and ChatGPT-4 to generate the top 10 differential diagnoses. The ChatGPT models were not specially trained or further reinforced for this task. Three GIM physicians from other medical institutions created differential diagnosis lists by reading the same clinical vignettes. We measured the rate of correct diagnosis within the top 10 differential diagnosis lists, within the top 5 differential diagnosis lists, and as the top diagnosis.

Results: In total, 52 case reports were analyzed. The rates of correct diagnosis by ChatGPT-4 within the top 10 differential diagnosis lists, top 5 differential diagnosis lists, and top diagnosis were 83% (43/52), 81% (42/52), and 60% (31/52), respectively. The rates of correct diagnosis by ChatGPT-3.5 within the top 10 differential diagnosis lists, top 5 differential diagnosis lists, and top diagnosis were 73% (38/52), 65% (34/52), and 42% (22/52), respectively. The rates of correct diagnosis by ChatGPT-4 were comparable to those by physicians within the top 10 (43/52, 83% vs 39/52, 75%, respectively; P=.47) and top 5 (42/52, 81% vs 35/52, 67%, respectively; P=.18) differential diagnosis lists and for the top diagnosis (31/52, 60% vs 26/52, 50%, respectively; P=.43); ChatGPT-4 was numerically higher in each comparison, but none of the differences was statistically significant. The ChatGPT models' diagnostic accuracy did not significantly vary based on open access status or the publication date (before 2011 vs 2022).

Conclusions: This study demonstrates the potential diagnostic accuracy of differential diagnosis lists generated using ChatGPT-3.5 and ChatGPT-4 for complex clinical vignettes from case reports published by the GIM department. The rate of correct diagnoses within the top 10 and top 5 differential diagnosis lists generated by ChatGPT-4 exceeds 80%. Although derived from a limited data set of case reports from a single department, our findings highlight the potential utility of ChatGPT-4 as a supplementary tool for physicians, particularly for those affiliated with the GIM department. Further investigations should explore the diagnostic accuracy of ChatGPT by using distinct case materials beyond its training data. Such efforts will provide a comprehensive insight into the role of artificial intelligence in enhancing clinical decision-making.
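The percentages and P values in the Results can be checked from the reported counts alone. The sketch below is illustrative only: the abstract does not name the statistical test, so a chi-square test with Yates continuity correction on a 2x2 table is an assumption made here, and the helper names `rate` and `yates_chi2_p` are hypothetical. Under that assumption, the computed values round to the figures quoted in the abstract.

```python
import math

N_CASES = 52  # case reports analyzed, per the abstract

# Correct diagnoses (ChatGPT-4, physicians), taken from the abstract.
COUNTS = {
    "top 10": (43, 39),
    "top 5": (42, 35),
    "top 1": (31, 26),
}


def rate(correct: int, total: int) -> float:
    """Percentage of cases with the correct diagnosis."""
    return 100.0 * correct / total


def yates_chi2_p(correct_a: int, correct_b: int, total: int) -> float:
    """Two-sided P value for a 2x2 table (two groups of `total` cases each).

    ASSUMPTION: chi-square test with Yates continuity correction; the
    abstract does not state which test the authors used.
    """
    a, b = correct_a, total - correct_a  # group A: correct, incorrect
    c, d = correct_b, total - correct_b  # group B: correct, incorrect
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # degenerate table: no evidence of a difference
    diff = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * diff * diff / denom
    # Survival function of chi-square with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


for label, (gpt4, physicians) in COUNTS.items():
    p = yates_chi2_p(gpt4, physicians, N_CASES)
    print(
        f"{label}: ChatGPT-4 {rate(gpt4, N_CASES):.0f}% "
        f"vs physicians {rate(physicians, N_CASES):.0f}%, P={p:.2f}"
    )
```

Only the standard library is used, so the check runs without SciPy; with SciPy available, `scipy.stats.chi2_contingency` with `correction=True` applies the same correction to a 2x2 table.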