Development and Evaluation of a Retrieval-Augmented Large Language Model Framework for Ophthalmology

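Medicine, Baseline (sea), Ranking (information retrieval), MEDLINE, Health care, Medical physics, Artificial intelligence, Computer science, Political science, Law, Oceanography, Economic growth, Economics, Geology
Authors
Mingjie Luo, Jianyu Pang, Shaowei Bi, Yunxi Lai, Jiaman Zhao, Yuanrui Shang, Tingxin Cui, Yahan Yang, Zhenzhe Lin, Lanqin Zhao, Xiaohang Wu, Duoru Lin, Jingjing Chen, Haotian Lin
Source
Journal: JAMA Ophthalmology [American Medical Association]
Volume (issue): 142 (9): 798-798 Citations: 38
Identifier
DOI: 10.1001/jamaophthalmol.2024.2513
Abstract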

Importance: Although augmenting large language models (LLMs) with knowledge bases may improve medical domain–specific performance, practical methods are needed for local implementation of LLMs that address privacy concerns and enhance accessibility for health care professionals.
Objective: To develop an accurate, cost-effective local implementation of an LLM to mitigate privacy concerns and support their practical deployment in health care settings.
Design, Setting, and Participants: ChatZOC (Sun Yat-Sen University Zhongshan Ophthalmology Center), a retrieval-augmented LLM framework, was developed by enhancing a baseline LLM with a comprehensive ophthalmic dataset and evaluation framework (CODE), which includes over 30 000 pieces of ophthalmic knowledge. This LLM was benchmarked against 10 representative LLMs, including GPT-4 and GPT-3.5 Turbo (OpenAI), across 300 clinical questions in ophthalmology. The evaluation, involving a panel of medical experts and biomedical researchers, focused on accuracy, utility, and safety. A double-masked approach was used to try to minimize bias assessment across all models. The study used a comprehensive knowledge base derived from ophthalmic clinical practice, without directly involving clinical patients.
Exposures: LLM response to clinical questions.
Main Outcomes and Measures: Accuracy, utility, and safety of LLMs in responding to clinical questions.
Results: The baseline model achieved a human ranking score of 0.48. The retrieval-augmented LLM had a score of 0.60, a difference of 0.12 (95% CI, 0.02-0.22; P = .02) from baseline and not different from GPT-4 with a score of 0.61 (difference = 0.01; 95% CI, −0.11 to 0.13; P = .89). For scientific consensus, the retrieval-augmented LLM was 84.0% compared with the baseline model of 46.5% (difference = 37.5%; 95% CI, 29.0%-46.0%; P < .001) and not different from GPT-4 with a value of 79.2% (difference = 4.8%; 95% CI, −0.3% to 10.0%; P = .06).
Conclusions and Relevance: Results of this quality improvement study suggest that the integration of high-quality knowledge bases improved the LLM’s performance in medical domains. This study highlights the transformative potential of augmented LLMs in clinical practice by providing reliable, safe, and practical clinical information. Further research is needed to explore the broader application of such frameworks in the real world.
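The abstract does not describe how ChatZOC retrieves knowledge or builds its prompts, so the following is only a generic sketch of the retrieval-augmentation idea: pick the knowledge snippets most relevant to a question and prepend them to the prompt sent to a locally hosted model. Everything here is an assumption made for illustration, including the three sample snippets, the word-overlap scoring, and the names `retrieve` and `build_prompt`. None of it is taken from the CODE knowledge base or the authors' implementation.

```python
import re
from collections import Counter

# Hypothetical snippets for illustration only; NOT drawn from the CODE dataset.
KNOWLEDGE_BASE = [
    "Cataract is a clouding of the lens of the eye.",
    "Glaucoma is a group of diseases that damage the optic nerve.",
    "Myopia means distant objects appear blurred.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def overlap_score(query, passage):
    """Count word tokens shared by query and passage (multiset intersection)."""
    return sum((Counter(tokenize(query)) & Counter(tokenize(passage))).values())


def retrieve(query, passages, k=1):
    """Return the k passages with the highest word overlap with the query."""
    ranked = sorted(passages, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:k]


def build_prompt(query, passages, k=1):
    """Prepend the retrieved passages to the question as context for an LLM."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    # The resulting string would be passed to whatever local LLM is in use.
    print(build_prompt("What is glaucoma?", KNOWLEDGE_BASE))
```

Word overlap is used only to keep the sketch dependency-free. A practical system would more likely use embedding-based or other ranked retrieval over a much larger knowledge base.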