CACER: Clinical concept Annotations for Cancer Events and Relations

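Computer science, Artificial intelligence, Natural language processing, Cancer, Information retrieval, Data science, Medicine, Internal medicine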
Authors
Yujuan Fu, Giridhar Kaushik Ramachandran, Ahmad Halwani, Bridget T. McInnes, Fei Xia, Kevin Lybarger, Meliha Yetişgen, Özlem Uzuner
Source
Journal: Journal of the American Medical Informatics Association [Oxford University Press]
Identifier
DOI:10.1093/jamia/ocae231
Abstract

Clinical notes contain unstructured representations of patient histories, including the relationships between medical problems and prescription drugs. To investigate the relationship between cancer drugs and their associated symptom burden, we extract structured, semantic representations of medical problem and drug information from the clinical narratives of oncology notes.

We present Clinical Concept Annotations for Cancer Events and Relations (CACER), a novel corpus with fine-grained annotations for over 48,000 medical problems and drug events and 10,000 drug-problem and problem-problem relations. Leveraging CACER, we develop and evaluate transformer-based information extraction (IE) models such as BERT, Flan-T5, Llama3, and GPT-4 using fine-tuning and in-context learning (ICL).

In event extraction, the fine-tuned BERT and Llama3 models achieved the highest performance at 88.2-88.0 F1, which is comparable to the inter-annotator agreement (IAA) of 88.4 F1. In relation extraction, the fine-tuned BERT, Flan-T5, and Llama3 achieved the highest performance at 61.8-65.3 F1. GPT-4 with ICL achieved the worst performance across both tasks.

The fine-tuned models significantly outperformed GPT-4 in ICL, highlighting the importance of annotated training data and model optimization. Furthermore, the BERT models performed similarly to Llama3. For our task, LLMs offer no performance advantage over the smaller BERT models.

The results emphasize the need for annotated training data to optimize models. Multiple fine-tuned transformer models achieved performance comparable to IAA for several extraction tasks.