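Empathy
Stroke (engine)
Psychology
Socioeconomic status
Rehabilitation
Generative grammar
Physical medicine and rehabilitation
Medicine
Physical therapy
Artificial intelligence
Psychiatry
Computer science
Engineering
Mechanical engineering
Environmental health
Population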
Authors
John Tayu Lee, Valerie Li, Jia-Jyun Wu, Hsin-Hsi Chen, Sheng-Fang Su, Brian Pin-Hsuan Chang, Richard Lee Lai, Chi-Hung Liu, Chung-Ting Chen, Valis Tanapima, Toby Kai-Bo Shen, Rifat Atun
Identifier
DOI:10.1038/s41746-025-01830-9
Abstract
Stroke is a leading cause of global morbidity and mortality, disproportionately impacting lower socioeconomic groups. In this study, we evaluated three generative LLMs (GPT, Claude, and Gemini) across four stages of stroke care: prevention, diagnosis, treatment, and rehabilitation. We applied three prompt engineering techniques, Zero-Shot Learning (ZSL), Chain of Thought (COT), and Talking Out Your Thoughts (TOT), to realistic stroke scenarios. Clinical experts assessed the outputs across five domains: (1) accuracy; (2) hallucinations; (3) specificity; (4) empathy; and (5) actionability, based on clinical competency benchmarks. Overall, the LLMs demonstrated suboptimal performance, with inconsistent scores across domains. Each prompt engineering method showed strengths in specific areas: TOT performed well in empathy and actionability, COT was strong in structured reasoning during diagnosis, and ZSL provided concise, accurate responses with fewer hallucinations, especially in the treatment stage. However, none consistently met high clinical standards across all stroke care stages.