Dedicated AI Expert System vs Generative AI With Large Language Model for Clinical Diagnoses

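Keywords: medical diagnosis; medicine; test (biology); differential diagnosis; clinical diagnosis; diagnostic test; pediatrics; family medicine; pathology; biology; paleontology
Authors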
Mitchell J. Feldman,Edward P. Hoffer,Jared Conley,Jaime Chang,Jeanhee Chung,Michael C. Jernigan,William Lester,Zachary H. Strasser,Henry C. Chueh
Source
Journal: JAMA Network Open [American Medical Association]
Volume (issue): 8 (5): e2512994
Identifier
DOI: 10.1001/jamanetworkopen.2025.12994
Abstract

Importance: Large language models (LLMs) have not yet been compared with traditional diagnostic decision support systems (DDSSs) on unpublished clinical cases.

Objective: To compare the performance of 2 widely used LLMs (ChatGPT, version 4 [hereafter, LLM1] and Gemini, version 1.5 [hereafter, LLM2]) with a DDSS (DXplain [hereafter, DDSS]) on 36 unpublished general medicine cases.

Design, Setting, and Participants: This diagnostic study, conducted from October 6, 2023, to November 22, 2024, looked for the presence of the known case diagnosis in the differential diagnoses of the LLMs and DDSS after data from previously unpublished clinical cases from 3 academic medical centers were entered. The systems’ performance was assessed both with and without laboratory test data. Each case was reviewed by 3 physicians blinded to the case diagnosis. Physicians identified all clinical findings as well as the subset deemed relevant to making the diagnosis for mapping to the DDSS’s controlled vocabulary. Two other physicians, also blinded to the diagnoses, entered the data from these cases into the DDSS, LLM1, and LLM2.

Exposures: All cases were entered into each LLM twice, with and without laboratory test results. For the DDSS, each case was entered 4 times: for all findings and for findings relevant to the diagnosis, each with and without laboratory test results. The top 25 diagnoses in each resulting differential diagnosis were reviewed.

Main Outcomes and Measures: Presence or absence of the case diagnosis in the system’s differential diagnosis and, when present, in which quintile it appeared in the top 25 diagnoses.

Results: Among 36 patient cases of various races and ethnicities, genders, and ages (mean [SD] age, 51.4 [16.4] years), in the version with all findings but no laboratory test results, the DDSS listed the case diagnosis in its differential diagnosis more often (56% [20 of 36]) than LLM1 (42% [15 of 36]) and LLM2 (39% [14 of 36]), although this difference did not reach statistical significance (DDSS vs LLM1, P = .09; DDSS vs LLM2, P = .08). All 3 systems listed the case diagnosis in most cases if laboratory test results were included (all findings DDSS, 72% [26 of 36]; LLM1, 64% [23 of 36]; and LLM2, 58% [21 of 36]).

Conclusions and Relevance: In this diagnostic study comparing the performance of a traditional DDSS and current LLMs on unpublished clinical cases, every system listed the case diagnosis in its top 25 diagnoses in most cases if laboratory test results were included. A hybrid approach that combines the parsing and expository linguistic capabilities of LLMs with the deterministic and explanatory capabilities of traditional DDSSs may produce synergistic benefits.
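As a reading aid, the reported percentages can be recomputed from the counts given in the Results, and the quintile outcome can be sketched for a top-25 list. This is a minimal illustration, not the authors' analysis code: the names `hits`, `percent`, and `quintile` are hypothetical, and the quintile rule (5 consecutive ranks per quintile) is an assumption, since the abstract does not define it. The P values are not reproduced because the abstract does not name the statistical test.

```python
N_CASES = 36  # number of cases, from the abstract

# Counts of cases whose known diagnosis appeared in the differential
# (all findings entered), as reported in the Results.
hits = {
    "without_labs": {"DDSS": 20, "LLM1": 15, "LLM2": 14},
    "with_labs": {"DDSS": 26, "LLM1": 23, "LLM2": 21},
}


def percent(count, total=N_CASES):
    """Return the share of cases as a whole-number percentage."""
    return round(100 * count / total)


def quintile(rank, list_size=25, bins=5):
    """Map a 1-based rank in the top-25 list to a quintile (1 to 5).

    Assumption: quintile 1 covers ranks 1-5, quintile 2 covers 6-10, etc.
    """
    if not 1 <= rank <= list_size:
        raise ValueError("rank must lie within the reviewed list")
    return (rank - 1) * bins // list_size + 1


# Percentages per condition and system, derived from the counts above.
rates = {
    condition: {system: percent(count) for system, count in counts.items()}
    for condition, counts in hits.items()
}

for condition, by_system in rates.items():
    print(condition, by_system)
```

The recomputed values agree with the percentages quoted in the Results (for example, 20 of 36 rounds to 56%).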
