Development and Evaluation of a Retrieval-Augmented Large Language Model Framework for Ophthalmology

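Medicine; Baseline (sea); Ranking (information retrieval); MEDLINE; Health care; Medical physics; Artificial intelligence; Computer science; Political science; Law; Oceanography; Economic growth; Economics; Geology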
Authors
Mingjie Luo, Jianyu Pang, Shaowei Bi, Yunxi Lai, Jiaman Zhao, Yuanrui Shang, Tingxin Cui, Yahan Yang, Zhenzhe Lin, Lanqin Zhao, Xiaohang Wu, Duoru Lin, Jingjing Chen, Haotian Lin
Source
Journal: JAMA Ophthalmology [American Medical Association]
Volume (issue): 142 (9): 798-798 Cited by: 38
Identifier
DOI:10.1001/jamaophthalmol.2024.2513
Abstract

Importance Although augmenting large language models (LLMs) with knowledge bases may improve medical domain–specific performance, practical methods are needed for local implementation of LLMs that address privacy concerns and enhance accessibility for health care professionals.
Objective To develop an accurate, cost-effective local implementation of an LLM to mitigate privacy concerns and support their practical deployment in health care settings.
Design, Setting, and Participants ChatZOC (Sun Yat-Sen University Zhongshan Ophthalmology Center), a retrieval-augmented LLM framework, was developed by enhancing a baseline LLM with a comprehensive ophthalmic dataset and evaluation framework (CODE), which includes over 30 000 pieces of ophthalmic knowledge. This LLM was benchmarked against 10 representative LLMs, including GPT-4 and GPT-3.5 Turbo (OpenAI), across 300 clinical questions in ophthalmology. The evaluation, involving a panel of medical experts and biomedical researchers, focused on accuracy, utility, and safety. A double-masked approach was used to try to minimize bias assessment across all models. The study used a comprehensive knowledge base derived from ophthalmic clinical practice, without directly involving clinical patients.
Exposures LLM response to clinical questions.
Main Outcomes and Measures Accuracy, utility, and safety of LLMs in responding to clinical questions.
Results The baseline model achieved a human ranking score of 0.48. The retrieval-augmented LLM had a score of 0.60, a difference of 0.12 (95% CI, 0.02-0.22; P = .02) from baseline and not different from GPT-4 with a score of 0.61 (difference = 0.01; 95% CI, −0.11 to 0.13; P = .89). For scientific consensus, the retrieval-augmented LLM was 84.0% compared with the baseline model of 46.5% (difference = 37.5%; 95% CI, 29.0%-46.0%; P < .001) and not different from GPT-4 with a value of 79.2% (difference = 4.8%; 95% CI, −0.3% to 10.0%; P = .06).
Conclusions and Relevance Results of this quality improvement study suggest that the integration of high-quality knowledge bases improved the LLM’s performance in medical domains. This study highlights the transformative potential of augmented LLMs in clinical practice by providing reliable, safe, and practical clinical information. Further research is needed to explore the broader application of such frameworks in the real world.