SBDH-Reader: a large language model-powered method for extracting social and behavioral determinants of health from clinical notes

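Recall, Computer science, Confusion matrix, Set (abstract data type), Confusion, Psychological intervention, Social media, Artificial intelligence, Natural language processing, Machine learning, Applied psychology, Information retrieval, Medicine, Psychology, World Wide Web, Nursing, Psychoanalysis, Cognitive psychology, Programming language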
Authors
Zifan Gu, Li He, Ayesha Naeem, Pui Man Chan, A. A. Mohamed, Heba M. A. Khalil, Yujia Guo, Jingwei Huang, Ismael Villanueva-Miranda, Ying Ding, Wenqi Shi, Matthew E. Dupre, Guanghua Xiao, Eric D. Peterson, Yang Xie, Ann Marie Návar, Donghan M. Yang
Source
Journal: Journal of the American Medical Informatics Association [Oxford University Press]
Citations: 1
Identifiers
DOI: 10.1093/jamia/ocaf124
Abstract

Objective: Social and behavioral determinants of health (SBDH) are increasingly recognized as essential for prognostication and for informing targeted interventions. Clinical notes often contain details about SBDH in unstructured format. Conventional methods for extracting these data tend to be labor intensive, inaccurate, and/or unscalable. In this study, we aim to develop and validate a large language model (LLM)-powered method to extract structured SBDH data from clinical notes through prompt engineering.

Materials and Methods: We developed SBDH-Reader to extract 6 categories of granular SBDH data by prompting GPT-4o: employment, housing, marital status, and 3 types of substance use (alcohol, tobacco, and drug use). SBDH-Reader was developed using 7225 notes from 6382 patients in the MIMIC-III database (2001–2012) and externally validated using 971 notes from 437 patients at The University of Texas Southwestern Medical Center (UTSW; 2022–2023). We evaluated SBDH-Reader's performance against human-annotated ground truths based on precision, recall, F1, and confusion matrix.

Results: When tested on the UTSW validation set, SBDH-Reader achieved a macro-average F1 ranging from 0.94 to 0.98 across the 6 SBDH categories. For clinically relevant adverse attributes, F1 ranged from 0.96 (employment; housing) to 0.99 (tobacco use). When extracting any adverse attributes across all SBDH categories, SBDH-Reader achieved an F1 of 0.97, recall of 0.97, and precision of 0.98 in the independent validation set.

Discussion: SBDH-Reader demonstrated strong performance in extracting structured SBDH data through effective prompt engineering of a general-purpose LLM, without the need for task-specific fine-tuning. Its modular design and adaptability to diverse datasets and documentation patterns support its applicability in real-world clinical settings.

Conclusion: SBDH-Reader has the potential to serve as a scalable and effective method for collecting real-time, patient-level SBDH data to support clinical research and care.
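
The evaluation metrics named in the abstract (precision, recall, F1, macro-average F1, and the confusion matrix) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the label names and the toy annotations below are hypothetical and chosen only to show how a macro-average F1 is derived from a confusion matrix for a single SBDH category.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]

def per_class_scores(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1)} computed from the confusion matrix."""
    cm = confusion_matrix(y_true, y_pred, labels)
    scores = {}
    for i, label in enumerate(labels):
        tp = cm[i][i]
        fp = sum(cm[r][i] for r in range(len(labels))) - tp  # column total minus TP
        fn = sum(cm[i]) - tp                                  # row total minus TP
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = (precision, recall, f1)
    return scores

def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_scores(y_true, y_pred, labels)
    return sum(f1 for _, _, f1 in scores.values()) / len(labels)

# Hypothetical attribute labels for one category (assumption, not from the paper).
LABELS = ["employed", "unemployed", "unknown"]

# Toy human annotations versus toy model output (fabricated for illustration only).
y_true = ["employed", "unemployed", "unknown", "unemployed", "employed", "unknown"]
y_pred = ["employed", "unemployed", "unknown", "employed", "employed", "unknown"]

cm = confusion_matrix(y_true, y_pred, LABELS)
score = macro_f1(y_true, y_pred, LABELS)
print(cm)
print(round(score, 4))
```

Macro-averaging weights every attribute equally regardless of how often it appears, so a rare attribute that is extracted poorly lowers the score as much as a common one would.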
