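Readability
Medicine
Context (archaeology)
Pairwise comparison
Wilcoxon signed-rank test
MEDLINE
Benchmarking
Reading (process)
Health care
Patient safety
Artificial intelligence
Computer science
Paleontology
Mann–Whitney U test
Economics
Marketing
Business
Programming language
Law
Internal medicine
Biology
Economic growth
Political science
Authors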
Ariana Genovese,Sahar Borna,Cesar A. Gomez-Cabello,Syed Ali Haider,Srinivasagam Prabha,Maissa Trabilsy,Cui Tao,Keith T. Aziz,Peter M. Murray,Antonio J. Forte
Abstract
Background: While artificial intelligence (AI) is revolutionizing healthcare, inaccurate or incomplete information from pre-trained large language models (LLMs) like ChatGPT poses significant risks to patient safety. Retrieval-Augmented Generation (RAG) offers a promising solution by leveraging curated knowledge bases to enhance accuracy and reliability, especially in high-demand specialties like plastic surgery.
Objectives: This study evaluates the performance of RAG-enabled AI models in addressing postoperative rhinoplasty questions, aiming to assess their safety and identify necessary improvements for effective implementation into clinical care.
Methods: Four RAG models (Gemini-1.0-Pro-002, Gemini-1.5-Flash-001, Gemini-1.5-Pro-001, and PaLM 2) were tested on 30 common patient inquiries. Responses, sourced from authoritative rhinoplasty texts, were evaluated for accuracy (1–5 scale), comprehensiveness (1–3 scale), readability (Flesch Reading Ease, Flesch-Kincaid Grade Level), and understandability/actionability (Patient Education Materials Assessment Tool). Statistical analyses included Wilcoxon rank sum, Armitage trend tests, and pairwise comparisons.
Results: When responses were generated, they were generally accurate (41.7% completely accurate); however, a 30.8% nonresponse rate revealed potential challenges with query context interpretation and retrieval. Gemini-1.0-Pro-002 demonstrated superior comprehensiveness (p < 0.001), but readability (FRE: 40–49) and understandability (mean: 0.7) fell below patient education standards. PaLM 2 scored lowest in actionability (p < 0.007).
Conclusions: This first application of RAG to postoperative rhinoplasty patient care highlights its strengths in accuracy alongside its limitations, including nonresponse and contextual understanding. Addressing these challenges will enable safer, more effective implementation of RAG models across diverse surgical and medical contexts, with the potential to revolutionize patient care by reducing physician workload while enhancing patient engagement.
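For reference, the two readability metrics named in the Methods are both computed from word, sentence, and syllable counts. The sketch below uses the standard published formulas for Flesch Reading Ease (FRE) and Flesch-Kincaid Grade Level (FKGL). The function names and the sample counts are illustrative assumptions, not taken from the study, and the study's own tooling for counting syllables is not described in the abstract.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade needed to read the text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a short passage (not data from the study):
# 100 words, 5 sentences, 150 syllables.
fre = flesch_reading_ease(100, 5, 150)
fkgl = flesch_kincaid_grade(100, 5, 150)
print(round(fre, 1))   # roughly 59.6
print(round(fkgl, 1))  # roughly 9.9
```

Under these formulas, longer sentences and more syllables per word both lower the FRE score, which is how a response can be accurate yet still fall in the 40–49 band reported in the Results.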