Asthma
Computer science
Medicine
Natural language processing
Psychology
Internal medicine
Authors
Y Liu, Fangfei Yu, X. Zhang, Xiaohan Tong, Kui Li, Weikuan Gu, Baiquan Yu
Abstract
Background: Asthma is a chronic inflammatory airway disease requiring long-term management. Artificial intelligence (AI)-driven tools such as large language models (LLMs) hold potential for enhancing patient education, especially for multilingual populations. However, comparative assessments of LLMs in disease-specific, bilingual health communication are limited.
Objective: This study aimed to evaluate and compare the performance of two advanced LLMs, ChatGPT-4o (OpenAI) and DeepSeek-v3 (DeepSeek AI), in providing bilingual (English and Chinese) education for patients with asthma, focusing on accuracy, completeness, clinical relevance, and language adaptability.
Methods: A total of 53 asthma-related questions were collected from real patient inquiries across 8 clinical domains. Each question was posed in both English and Chinese to ChatGPT-4o and DeepSeek-v3. Responses were evaluated using a 7-dimension clinical quality framework (eg, completeness, consensus consistency, and reasoning ability) adapted from Google Health. Three respiratory clinicians performed blinded scoring evaluations. Descriptive statistics and Wilcoxon signed-rank tests were applied to compare performance across domains and against theoretical maximums.
Results: Both models demonstrated high overall quality in generating bilingual educational content. DeepSeek-v3 outperformed ChatGPT-4o in completeness and currency, particularly in treatment-related knowledge and symptom interpretation. ChatGPT-4o showed advantages in clarity and accessibility. In English responses, ChatGPT-4o achieved perfect scores across 5 domains but scored lower in clinical features (mean 3.78, SD 0.16; P=.02), treatment (mean 3.90, SD 0.05; P=.03), and differential diagnosis (mean 3.83, SD 0.29; P=.08).
Conclusions: ChatGPT-4o and DeepSeek-v3 each offer distinct strengths for bilingual asthma education. While ChatGPT-4o is more suitable for general health education due to its expressive clarity, DeepSeek-v3 provides more up-to-date and comprehensive clinical content. Both models can serve as effective supplementary tools for patient self-management but cannot replace professional medical advice. Future AI health care systems should enhance clinical reasoning, ensure guideline currency, and integrate human oversight to optimize safety and accuracy.