Large Language Models for Automated Synoptic Reports and Resectability Categorization in Pancreatic Cancer

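Medicine, Classification, Cancer, Radiology, Pancreatic cancer, Natural language processing, Medical physics, Artificial intelligence, Oncology, General surgery, Internal medicine, Computer science
Authors
Rajesh Bhayana, Bipin Nanda, Taher Dehkharghanian, Yangqing Deng, Nishaant Bhambra, Gavin J.B. Elias, Daksh Datta, Avinash Kambadakone, Chaya Shwaartz, Carol‐Anne Moulton, David Henault, Steven Gallinger, Satheesh Krishna
Source
Journal: Radiology [Radiological Society of North America]
Volume (issue): 311 (3): e233117-e233117 Citations: 65
Identifier
DOI: 10.1148/radiol.233117
Abstract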

Background: Structured radiology reports for pancreatic ductal adenocarcinoma (PDAC) improve surgical decision-making over free-text reports, but radiologist adoption is variable. Resectability criteria are applied inconsistently.

Purpose: To evaluate the performance of large language models (LLMs) in automatically creating PDAC synoptic reports from original reports and to explore performance in categorizing tumor resectability.

Materials and Methods: In this institutional review board-approved retrospective study, 180 consecutive PDAC staging CT reports on patients referred to the authors' European Society for Medical Oncology-designated cancer center from January to December 2018 were included. Reports were reviewed by two radiologists to establish the reference standard for 14 key findings and National Comprehensive Cancer Network (NCCN) resectability category. GPT-3.5 and GPT-4 (accessed September 18-29, 2023) were prompted to create synoptic reports from original reports with the same 14 features, and their performance was evaluated (recall, precision, F1 score). To categorize resectability, three prompting strategies (default knowledge, in-context knowledge, chain-of-thought) were used for both LLMs. Hepatopancreaticobiliary surgeons reviewed original and artificial intelligence (AI)-generated reports to determine resectability, with accuracy and review time compared. The McNemar test, t test, Wilcoxon signed-rank test, and mixed effects logistic regression models were used where appropriate.

Results: GPT-4 outperformed GPT-3.5 in the creation of synoptic reports (F1 score: 0.997 vs 0.967, respectively). Compared with GPT-3.5, GPT-4 achieved equal or higher F1 scores for all 14 extracted features. GPT-4 had higher precision than GPT-3.5 for extracting superior mesenteric artery involvement (100% vs 88.8%, respectively). For categorizing resectability, GPT-4 outperformed GPT-3.5 for each prompting strategy. For GPT-4, chain-of-thought prompting was most accurate, outperforming in-context knowledge prompting (92% vs 83%, respectively; P = .002), which outperformed the default knowledge strategy (83% vs 67%, P < .001). Surgeons were more accurate in categorizing resectability using AI-generated reports than original reports (83% vs 76%, respectively; P = .03), while spending less time on each report (58%; 95% CI: 0.53, 0.62).

Conclusion: GPT-4 created near-perfect PDAC synoptic reports from original reports. GPT-4 with chain-of-thought achieved high accuracy in categorizing resectability. Surgeons were more accurate and efficient using AI-generated reports.

© RSNA, 2024
Supplemental material is available for this article. See also the editorial by Chang in this issue.
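
The abstract reports recall, precision, and F1 score for feature extraction and uses the McNemar test for paired accuracy comparisons. As a rough illustration only, the sketch below shows how these quantities are commonly computed from raw counts. The function names and the example counts are hypothetical and are not taken from the study; the exact (binomial) form of the McNemar test is one common variant, and the abstract does not say which variant the authors used.

```python
from math import comb


def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from true-positive, false-positive,
    and false-negative counts. Undefined ratios are reported as 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b and c are the discordant pair counts: cases that method A got right
    and method B got wrong, and vice versa. Under the null hypothesis the
    smaller count follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical counts for one extracted feature (not study data).
    p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=0)
    print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")

    # Hypothetical discordant pairs between two prompting strategies.
    print(f"McNemar exact p={mcnemar_exact_p(b=0, c=5):.4f}")
```

F1 is the harmonic mean of precision and recall, so a feature extracted with perfect recall but some false positives still scores below 1. The McNemar test looks only at the discordant pairs because reports that both methods classify correctly (or both incorrectly) carry no information about which method is better.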