Use of a large language model with instruction‐tuning for reliable clinical frailty scoring

Authors
Xiang Lee Jamie Kee, Gerald Gui Ren Sng, Daniel Yan Zheng Lim, Joshua Yi Min Tung, Hairil Rizal Abdullah, Anupama Roy Chowdury
Source
Journal: Journal of the American Geriatrics Society [Wiley]
Citations: 3
Identifier
DOI: 10.1111/jgs.19114
Abstract

Background: Frailty is an important predictor of health outcomes, characterized by increased vulnerability due to physiological decline. The Clinical Frailty Scale (CFS) is commonly used for frailty assessment but may be influenced by rater bias. Use of artificial intelligence (AI), particularly large language models (LLMs), offers a promising method for efficient and reliable frailty scoring.

Methods: The study used seven standardized patient scenarios to evaluate the consistency and reliability of CFS scoring by OpenAI's GPT‐3.5‐turbo model. Two methods were tested: a basic prompt and an instruction‐tuned prompt incorporating the CFS definition, a directive for accurate responses, and temperature control. Score distributions from the two prompts were compared using the Mann–Whitney U test, and inter‐rater reliability was assessed with Fleiss' kappa. The outputs were also compared with historic human scores of the same scenarios.

Results: The LLM's median scores were similar to those of human raters, with differences of no more than one point. Significant differences in score distributions were observed between the basic and instruction‐tuned prompts in five out of seven scenarios. The instruction‐tuned prompt showed high inter‐rater reliability (Fleiss' kappa of 0.887) and produced consistent responses in all scenarios. Difficulty in scoring was noted in scenarios with less explicit information on activities of daily living (ADLs).

Conclusions: This study demonstrates the potential of LLMs to score clinical frailty consistently and with high reliability. It also shows that prompt engineering via instruction‐tuning can be a simple but effective approach for optimizing LLMs in healthcare applications. The LLM may overestimate frailty scores when less information about ADLs is provided, possibly because it is less prone to implicit assumptions and extrapolation than humans. Future research could explore the integration of LLMs in clinical research and frailty‐related outcome prediction.
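
The abstract reports inter‐rater reliability as Fleiss' kappa, with repeated model outputs playing the role of raters. As a rough illustration of how that statistic is computed, here is a minimal pure-Python sketch. The function name `fleiss_kappa` and the example scores are hypothetical and are not taken from the study; the sketch assumes every scenario receives the same number of scores.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of subjects, each a list of category labels.

    Assumption: every subject (here, a patient scenario) is scored by the
    same number of raters (here, repeated model runs).
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("each subject needs the same number (>= 2) of ratings")

    # Count how many raters chose each category, per subject.
    counts = [Counter(r) for r in ratings]

    # Observed agreement per subject, then averaged over subjects.
    p_i = [
        (sum(c * c for c in cnt.values()) - n_raters)
        / (n_raters * (n_raters - 1))
        for cnt in counts
    ]
    p_bar = sum(p_i) / n_subjects

    # Chance agreement from the overall share of each category.
    totals = Counter()
    for cnt in counts:
        totals.update(cnt)
    n_total = n_subjects * n_raters
    p_e = sum((t / n_total) ** 2 for t in totals.values())

    if p_e == 1.0:
        # Every rating falls in one category; kappa is undefined (0/0).
        return float("nan")
    return (p_bar - p_e) / (1 - p_e)


if __name__ == "__main__":
    # Hypothetical CFS scores: 3 scenarios, 4 repeated runs each.
    example_scores = [[3, 3, 3, 4], [6, 6, 6, 6], [2, 2, 3, 2]]
    print(round(fleiss_kappa(example_scores), 3))
```

The comparison of score distributions between the two prompts would be a separate step; `scipy.stats.mannwhitneyu` is one existing implementation of the Mann–Whitney U test that could serve there.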