Using GPT-4o for CAD-RADS feature extraction and categorization with free-text coronary CT Angiography reports (Preprint)

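Classification; dual radar; coronary angiography; computer-aided design; preprint; feature (linguistics); computer science; medicine; radiology; artificial intelligence; feature extraction; internal medicine; mammography; engineering; World Wide Web; linguistics; philosophy; cancer; engineering drawing; breast cancer; myocardial infarction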

Authors

Youmei Chen, Mengshi Dong, Jie Sun, Zhanao Meng, Yiqing Yang, Abudushalamu Muhetaier, Chao Li, Jie Qin

Source

Journal: JMIR Medical Informatics [JMIR Publications]
Date: 2025-06-17 Volume/Issue: 13: e70967

Identifier

DOI: 10.2196/70967

Abstract

Background: Although the Coronary Artery Disease Reporting and Data System (CAD-RADS) provides a standardized approach, radiologists continue to favor free-text reports. This preference makes data extraction and analysis in longitudinal studies difficult, potentially limiting large-scale research and quality assessment initiatives.

Objective: To evaluate the ability of the generative pre-trained transformer (GPT)-4o model to convert real-world coronary computed tomography angiography (CCTA) free-text reports into structured data and automatically identify CAD-RADS categories and P categories.

Methods: This retrospective study analyzed CCTA reports from between January 2024 and July 2024. A subset of 25 reports was used for prompt engineering to instruct the large language model (LLM) to extract CAD-RADS categories, P categories, and the presence of myocardial bridges and noncalcified plaques. Reports were processed using the GPT-4o API (application programming interface) and custom Python scripts. The ground truth was established by radiologists based on the CAD-RADS 2.0 guidelines. Model performance was assessed using accuracy, sensitivity, specificity, and F1-score. Intrarater reliability was assessed using the Cohen κ coefficient.

Results: Among 999 patients (median age 66 y, range 58-74; 650 males), CAD-RADS categorization showed accuracy of 0.98-1.00 (95% CI 0.9730-1.0000), sensitivity of 0.95-1.00 (95% CI 0.9191-1.0000), specificity of 0.98-1.00 (95% CI 0.9669-1.0000), and F1-score of 0.96-1.00 (95% CI 0.9253-1.0000). P categories showed accuracy of 0.97-1.00 (95% CI 0.9569-0.9990), sensitivity of 0.90-1.00 (95% CI 0.8085-1.0000), specificity of 0.97-1.00 (95% CI 0.9533-1.0000), and F1-score of 0.91-0.99 (95% CI 0.8377-0.9967). Myocardial bridge detection achieved an accuracy of 0.98 (95% CI 0.9680-0.9870), and noncalcified coronary plaque detection showed an accuracy of 0.98 (95% CI 0.9680-0.9870). Cohen κ values for all classifications exceeded 0.98.

Conclusions: The GPT-4o model efficiently and accurately converts CCTA free-text reports into structured data, excelling in CAD-RADS classification, plaque burden assessment, and detection of myocardial bridges and noncalcified plaques.
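The evaluation metrics named in the abstract (accuracy, sensitivity, specificity, F1-score, and Cohen κ) can be illustrated with a minimal Python sketch. This is not the authors' script: the function names, the one-vs-rest treatment of each category, and the toy labels are all assumptions made for illustration, and no CI computation is attempted.

```python
from collections import Counter


def one_vs_rest_metrics(truth, pred, label):
    """Accuracy, sensitivity, specificity and F1 for one category,
    treating that category as positive and all others as negative.
    (Hypothetical helper; the paper does not publish its code here.)"""
    pairs = list(zip(truth, pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    tn = len(pairs) - tp - fp - fn

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def cohen_kappa(a, b):
    """Cohen's kappa: agreement between two label sequences,
    corrected for the agreement expected by chance."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1:
        return 1.0  # both sequences use a single identical label
    return (observed - expected) / (1 - expected)


# Toy labels, invented for illustration only (not study data).
truth = ["0", "1", "2", "2", "3", "1"]
pred = ["0", "1", "2", "3", "3", "1"]
metrics_cat2 = one_vs_rest_metrics(truth, pred, "2")
kappa = cohen_kappa(truth, pred)
```

In practice the two helpers would be run once per CAD-RADS or P category, which is one plausible reading of why the abstract reports each metric as a range across categories.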
