Large Language Model Predicts Surgeon Recommendations for Imaging and Surgery for Patients Presenting for Knee and Shoulder Complaints With 70% and 81% Accuracy Using Previsit Questionnaire Responses

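Medicine; Orthopedic surgery; Rotator cuff; Physical therapy; Meniscus; Sports medicine; Anterior cruciate ligament; Arthroscopy; Introduction; Shoulder surgery; Cohort; Surgery; Radiology; Medical imaging; Retrospective cohort study; Physical examination; Anterior cruciate ligament reconstruction; MEDLINE; Evidence-based medicine; Cohort study; Medical physics; Avascular necrosis; Arthroplasty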

Authors

Ryan T. Halvorson, Timothy Keeley, Kian Niknam, Travis Zack, S. Majumdar, B. Feeley, Alan L Zhang, Drew A. Lansdown

Source

Journal: Arthroscopy [Elsevier BV]
Date: 2026-01-01; Volume (issue), pages: 42 (1): 185-193; Citations: 1

Links

nih.gov, doi.org

Identifiers

DOI: 10.1002/arj.70016

Abstract

PURPOSE: To validate the performance of a pretrained large language model (LLM) in predicting orthopaedic surgeon recommendations for management of newly referred patients, using free-text previsit questionnaire responses as input.
METHODS: This retrospective cross-sectional study included new patients visiting an orthopaedic sports medicine clinic between 2020 and 2023. Using zero-shot prompting, the LLM analyzed previsit questionnaire responses (e.g., "When did you start to have pain?") to predict whether patients required advanced imaging and/or surgical intervention. The LLM was blinded to all other clinical information, including surgeon notes, physical exams, and referral data. Model predictions were evaluated with accuracy, sensitivity, and specificity in comparison to actual surgeon-generated plans. For a subset of patients who had undergone advanced imaging, the LLM was augmented with free-text radiology reports and asked to provide updated surgical recommendations.
RESULTS: In the combined cohort of 1141 patients, the LLM predicted surgeon recommendation for advanced imaging with 70% accuracy, 83% sensitivity, and 64% specificity using previsit questionnaire responses alone. Imaging predictions were accurate for common diagnoses, including anterior cruciate ligament (ACL, 94%), meniscus (85%), and rotator cuff (80%) injuries, but poor for knee arthritis (54%) and shoulder arthritis (66%). When augmented with imaging reports, the LLM predicted recommendations for surgery with 81% accuracy, 88% sensitivity, and 72% specificity. Surgical predictions were highly accurate for ACL (93%), meniscus (78%), rotator cuff (83%), and shoulder instability-related pathologies (78%).
CONCLUSIONS: Using previsit questionnaire data from new orthopaedic patients with knee and shoulder complaints, the pretrained LLM showed 70% accuracy for imaging recommendations, and the augmented surgical-decision LLM showed 81% accuracy for surgical recommendations.
LEVEL OF EVIDENCE: Level III, retrospective diagnostic case-control study.
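
The abstract reports accuracy, sensitivity, and specificity of the LLM's predictions against surgeon-generated plans. As a minimal sketch of how those three metrics are conventionally computed for a binary recommendation (e.g., "advanced imaging: yes/no"), the Python below uses the standard confusion-matrix definitions. The function name `binary_metrics` and the toy labels are hypothetical illustrations; they are not the study's code or data.

```python
import math


def binary_metrics(predicted, actual):
    """Return (accuracy, sensitivity, specificity) for paired boolean labels.

    predicted: model recommendations (True = recommend imaging/surgery)
    actual:    surgeon recommendations, treated as the reference standard
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one labeled case is required")

    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # true positives
    tn = sum(1 for p, a in pairs if not p and not a)  # true negatives
    fp = sum(1 for p, a in pairs if p and not a)      # false positives
    fn = sum(1 for p, a in pairs if not p and a)      # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Sensitivity: share of surgeon-positive cases the model also flagged.
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    # Specificity: share of surgeon-negative cases the model also cleared.
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    return accuracy, sensitivity, specificity


# Hypothetical toy labels, not study data.
model_says = [True, True, False, False, True]
surgeon_says = [True, False, False, False, True]
acc, sens, spec = binary_metrics(model_says, surgeon_says)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Under these definitions, a result such as the reported 83% sensitivity with 64% specificity for imaging means the model's errors are mostly recommendations for imaging that the surgeon did not order, with fewer missed cases where the surgeon did order imaging.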
