
Physician- and Large Language Model–Generated Hospital Discharge Summaries

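Medicine, Narrative, Likert scale, Proportion (ratio), Family medicine, Health care, Documentation, Statistics, Mathematics, Physics, Philosophy, Quantum mechanics, Economics, Economic growth, Programming language, Linguistics, Computer science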
Authors
Christopher Y. K. Williams, Charumathi Raghu Subramanian, Syed Salman Ali, Michael Apolinario, Elisabeth Askin, Peter Barish, Monica Cheng, William James Deardorff, Nisha Donthi, Smitha Ganeshan, Owen Huang, Molly A. Kantor, Andrew Lai, Ashley Manchanda, Kendra A. Moore, Anoop Muniyappa, Geethu Nair, Prashant Patel, Lekshmi Santhosh, Susan Schneider
Source
Journal: JAMA Internal Medicine [American Medical Association]
Identifier
DOI: 10.1001/jamainternmed.2025.0821
Abstract

Importance: High-quality discharge summaries are associated with improved patient outcomes, but contribute to clinical documentation burden. Large language models (LLMs) provide an opportunity to support physicians by drafting discharge summary narratives.

Objective: To determine whether LLM-generated discharge summary narratives are of comparable quality and safety to those of physicians.

Design, Setting, and Participants: This cross-sectional study conducted at the University of California, San Francisco included 100 randomly selected inpatient hospital medicine encounters of 3 to 6 days' duration between 2019 and 2022. The analysis took place in July 2024.

Exposure: A blinded evaluation of physician- and LLM-generated narratives was performed in duplicate by 22 attending physician reviewers.

Main Outcomes and Measures: Narratives were reviewed for overall quality, reviewer preference, comprehensiveness, concision, coherence, and 3 error types (inaccuracies, omissions, and hallucinations). Each error individually, and each narrative overall, were assigned potential harmfulness scores ranging from 0 to 7 on an adapted Agency for Healthcare Research and Quality scale.

Results: Across 100 encounters, LLM- and physician-generated narratives were comparable in overall quality on a Likert scale ranging from 1 to 5 (higher scores indicate higher quality; mean [SD] score, 3.67 [0.49] vs 3.77 [0.57]; P = .21) and reviewer preference (χ² = 5.2; P = .27). LLM-generated narratives were more concise (mean [SD] score, 4.01 [0.37] vs 3.70 [0.59]; P < .001) and more coherent (mean [SD] score, 4.16 [0.39] vs 4.01 [0.53]; P = .02) than their physician-generated counterparts, but less comprehensive (mean [SD] score, 3.72 [0.58] vs 4.13 [0.58]; P < .001). LLM-generated narratives contained more unique errors (mean [SD] errors per summary, 2.91 [2.54]) than physician-generated narratives (mean [SD] errors per summary, 1.82 [1.94]). There was no significant difference in the potential for harm between LLM- and physician-generated narratives across individual errors (mean [SD] of 1.35 [1.07] vs 1.34 [1.05]; P = .99), with 6 and 5 individual errors, respectively, with scores of 4 (potential for permanent harm) or greater. Both LLM- and physician-generated narratives had low overall potential for harm (scores <1 on a scale ranging from 0 to 7), with LLM-generated narratives scoring higher than physician narratives (mean [SD] score of 0.84 [0.98] vs 0.36 [0.70]; P < .001) and only 1 LLM-generated narrative (compared with 0 physician-generated narratives) scoring 4 or greater.

Conclusions and Relevance: In this cross-sectional study of 100 inpatient hospital medicine encounters, LLM-generated discharge summary narratives were of comparable quality, and were preferred equally, to those generated by physicians. LLM-generated narratives were more likely to contain errors but had low overall harmfulness scores. These results suggest that, in clinical practice, using such narratives after human review may provide a viable option for hospitalists.
