Development and Evaluation of a Retrieval-Augmented Large Language Model Framework for Ophthalmology

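Medicine, Baseline (sea), Ranking (information retrieval), MEDLINE, Health care, Medical physics, Artificial intelligence, Computer science, Political science, Law, Oceanography, Economic growth, Economics, Geology
Authors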
Mingjie Luo, Jianyu Pang, Shaowei Bi, Yunxi Lai, Jiaman Zhao, Yuanrui Shang, Tingxin Cui, Yahan Yang, Zhenzhe Lin, Lanqin Zhao, Xiaohang Wu, Duoru Lin, Jingjing Chen, Haotian Lin
Source
Journal: JAMA Ophthalmology [American Medical Association]
Volume (issue): 142 (9): 798-798 Citations: 38
Identifier
DOI:10.1001/jamaophthalmol.2024.2513
Abstract

Importance Although augmenting large language models (LLMs) with knowledge bases may improve medical domain–specific performance, practical methods are needed for local implementation of LLMs that address privacy concerns and enhance accessibility for health care professionals.
Objective To develop an accurate, cost-effective local implementation of an LLM that mitigates privacy concerns and supports practical deployment in health care settings.
Design, Setting, and Participants ChatZOC (Sun Yat-Sen University Zhongshan Ophthalmology Center), a retrieval-augmented LLM framework, was developed by enhancing a baseline LLM with a comprehensive ophthalmic dataset and evaluation framework (CODE), which includes over 30 000 pieces of ophthalmic knowledge. This LLM was benchmarked against 10 representative LLMs, including GPT-4 and GPT-3.5 Turbo (OpenAI), across 300 clinical questions in ophthalmology. The evaluation, involving a panel of medical experts and biomedical researchers, focused on accuracy, utility, and safety. A double-masked approach was used to try to minimize bias in the assessment of all models. The study used a comprehensive knowledge base derived from ophthalmic clinical practice, without directly involving clinical patients.
Exposures LLM response to clinical questions.
Main Outcomes and Measures Accuracy, utility, and safety of LLMs in responding to clinical questions.
Results The baseline model achieved a human ranking score of 0.48. The retrieval-augmented LLM had a score of 0.60, a difference of 0.12 (95% CI, 0.02-0.22; P = .02) from baseline, and did not differ from GPT-4, which had a score of 0.61 (difference = 0.01; 95% CI, −0.11 to 0.13; P = .89). For scientific consensus, the retrieval-augmented LLM reached 84.0% compared with 46.5% for the baseline model (difference = 37.5%; 95% CI, 29.0%-46.0%; P < .001) and did not differ from GPT-4, which had a value of 79.2% (difference = 4.8%; 95% CI, −0.3% to 10.0%; P = .06).
Conclusions and Relevance Results of this quality improvement study suggest that the integration of high-quality knowledge bases improved the LLM’s performance in medical domains. This study highlights the transformative potential of augmented LLMs in clinical practice by providing reliable, safe, and practical clinical information. Further research is needed to explore the broader application of such frameworks in the real world.