Large Language Models for Diagnosing Focal Liver Lesions From CT/MRI Reports: A Comparative Study With Radiologists

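Medical diagnosis, Histopathology, Retrospective cohort study, Medicine, Differential diagnosis, Radiology, Magnetic resonance imaging, Pathology
Authors
Liuji Sheng, Yidi Chen, Hong Wei, Feng Che, Yingyi Wu, Qin Qin, Chongtu Yang, Yanshu Wang, Jingwen Peng, Mustafa R. Bashir, Maxime Ronot, Bin Song, Hanyu Jiang
Source
Journal: Liver International [Wiley]
Volume (issue): 45 (6); cited by: 3
Identifier
DOI: 10.1111/liv.70115
Abstract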

Background & Aims: Whether large language models (LLMs) could be integrated into the diagnostic workflow of focal liver lesions (FLLs) remains unclear. We aimed to investigate two generic LLMs (ChatGPT-4o and Gemini) regarding their diagnostic accuracies referring to the CT/MRI reports, compared to and combined with radiologists of different experience levels.

Methods: From April 2022 to April 2024, this single-center retrospective study included consecutive adult patients who underwent contrast-enhanced CT/MRI for single FLL and subsequent histopathologic examination. The LLMs were prompted by clinical information and the “findings” section of radiology reports three times to provide differential diagnoses in the descending order of likelihood, with the first considered the final diagnosis. In the research setting, six radiologists (three junior and three middle-level) independently reviewed the CT/MRI images and clinical information in two rounds (first alone, then with LLM assistance). In the clinical setting, diagnoses were retrieved from the “impressions” section of radiology reports. Diagnostic accuracy was investigated against histopathology.

Results: 228 patients (median age, 59 years; 155 males) with 228 FLLs (median size, 3.6 cm) were included. Regarding the final diagnosis, the accuracy of two-step ChatGPT-4o (78.9%) was higher than single-step ChatGPT-4o (68.0%, p < 0.001) and single-step Gemini (73.2%, p = 0.004), similar to real-world radiology reports (80.0%, p = 0.34) and junior radiologists (78.9%–82.0%; p-values, 0.21 to > 0.99), but lower than middle-level radiologists (84.6%–85.5%; p-values, 0.001 to 0.02). No incremental diagnostic value of ChatGPT-4o was observed for any radiologist (p-values, 0.63 to > 0.99).

Conclusion: Two-step ChatGPT-4o showed matching accuracies to real-world radiology reports and junior radiologists for diagnosing FLLs but was less accurate than middle-level radiologists and demonstrated little incremental diagnostic value.
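
As an illustration of how an accuracy comparison against histopathology might be computed, the following is a minimal Python sketch, not the authors' analysis. The abstract does not name the statistical test behind its p-values; an exact McNemar test for paired readers is assumed here only because every reader is scored on the same lesions. The function names, diagnosis labels, and five-case data are hypothetical.

```python
from math import comb


def accuracy(predicted, reference):
    """Fraction of cases whose final (top-ranked) diagnosis matches the reference."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return sum(p == r for p, r in zip(predicted, reference)) / len(reference)


def mcnemar_exact(correct_a, correct_b):
    """Two-sided exact McNemar p-value for two readers scored on the same cases.

    Assumption: the abstract does not state which test produced its p-values.
    """
    only_a = sum(a and not b for a, b in zip(correct_a, correct_b))
    only_b = sum(b and not a for a, b in zip(correct_a, correct_b))
    n = only_a + only_b  # discordant pairs
    if n == 0:
        return 1.0
    # Binomial(n, 0.5) lower tail at the smaller discordant count, doubled.
    tail = sum(comb(n, k) for k in range(min(only_a, only_b) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy data: histopathology is the reference standard.
histopathology = ["HCC", "ICC", "HCC", "hemangioma", "metastasis"]
reader_a = ["HCC", "ICC", "HCC", "HCC", "metastasis"]
reader_b = ["HCC", "HCC", "ICC", "HCC", "metastasis"]

# Per-case correctness, needed for the paired comparison.
correct_a = [p == r for p, r in zip(reader_a, histopathology)]
correct_b = [p == r for p, r in zip(reader_b, histopathology)]

print(accuracy(reader_a, histopathology))
print(accuracy(reader_b, histopathology))
print(mcnemar_exact(correct_a, correct_b))
```

A paired test uses only the cases where the two readers disagree, which is why per-case correctness is tracked rather than just the two accuracy figures.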