Structured Report Generation for Breast Cancer Imaging Based on Large Language Modeling: A Comparative Analysis of GPT-4 and DeepSeek

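McNemar's test, Medicine, Radiology, Breast cancer, Mammography, Lesion, Concordance, Breast imaging, Cancer, Medical physics, Internal medicine, Pathology, Statistics, Mathematics
Authors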
Kun Chen,Xuefeng Hou,Xiaofeng Li,Wengui Xu,Heqing Yi
Source
Journal: Academic Radiology [Elsevier]
Volume (Issue): 32 (10): 5693-5702 Citations: 5
Identifier
DOI:10.1016/j.acra.2025.07.046
Abstract

This study compares the performance of the GPT-4 and DeepSeek large language models in generating structured, integrated multimodality imaging reports for breast cancer from free-text radiology reports covering mammography, ultrasound, MRI, and PET/CT. A retrospective analysis was conducted on 1358 free-text reports from 501 breast cancer patients across two institutions. The study design involved synthesizing multimodal imaging data into structured reports with three components: primary lesion characteristics, metastatic lesions, and TNM staging. Input prompts were standardized for both models, with GPT-4 using predesigned instructions and DeepSeek requiring manual input. Reports were evaluated on physician satisfaction using a Likert scale; on descriptive accuracy, including lesion localization, size, SUV, and metastasis assessment; and on TNM staging correctness according to NCCN guidelines. Statistical analysis included McNemar tests for binary outcomes and correlation analysis for multiclass comparisons, with a significance threshold of P < .05. Physician satisfaction scores showed strong correlation between models, with r-values of 0.665 and 0.558 and P-values below .001. Both models demonstrated high accuracy in data extraction and integration. The mean accuracy for primary lesion features was 91.7% for GPT-4 and 92.1% for DeepSeek, while feature synthesis accuracy was 93.4% for GPT-4 and 93.9% for DeepSeek. Metastatic lesion identification showed comparable overall accuracy at 93.5% for GPT-4 and 94.4% for DeepSeek. GPT-4 performed better in pleural lesion detection, with 94.9% accuracy compared to 79.5% for DeepSeek, whereas DeepSeek achieved higher accuracy in mesenteric metastasis identification, at 87.5% vs 43.8% for GPT-4. TNM staging accuracy exceeded 92% for T-stage and 94% for M-stage, with N-stage accuracy improving beyond 90% when supplemented with physical exam data.
Both GPT-4 and DeepSeek effectively generate structured breast cancer imaging reports with high accuracy in data mining, integration, and TNM staging. Integrating these models into clinical practice is expected to enhance report standardization and physician productivity.
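
The abstract names McNemar tests for the binary (correct vs. incorrect) comparisons between the two models but does not say which variant was used. As a minimal sketch, assuming the exact (binomial) form of the test and paired per-case correctness flags for the two models, the calculation could look like the following. The function names `discordant_counts` and `mcnemar_exact` are hypothetical, and the inputs in the usage note are illustrative only; they are not data from the study.

```python
from math import comb


def discordant_counts(correct_a, correct_b):
    """Count discordant pairs from two paired lists of booleans.

    b: cases where model A was correct and model B was not.
    c: cases where model A was incorrect and model B was correct.
    Concordant pairs (both right or both wrong) do not enter the test.
    """
    b = sum(1 for a, d in zip(correct_a, correct_b) if a and not d)
    c = sum(1 for a, d in zip(correct_a, correct_b) if not a and d)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts.

    Under the null hypothesis the discordant pairs split 50/50,
    so b follows a Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    # Probability of a split at least as lopsided as the one observed,
    # in one direction, doubled for a two-sided test and capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

With hypothetical inputs, `b, c = discordant_counts(flags_model_a, flags_model_b)` followed by `mcnemar_exact(b, c)` gives a p-value that would be compared against the study's stated threshold of P < .05. Because only discordant pairs matter, two models with similar overall accuracy can still differ significantly if their errors fall on different cases.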