Computer science
Readability
Vietnamese
Natural language processing
Hindi
World Wide Web
Medicine
Linguistics
Philosophy
Programming language
Authors
Bradley D. Menz,Natansh D. Modi,Ahmad Y. Abuhelwa,Warit Ruanglertboon,Agnès Vitry,Yuan Gao,Lee X. Li,Rakchha Chhetri,Benjamin Chu,Stephen Bacchi,Ganessan Kichenadasse,Adel Shahnam,Andrew Rowland,Michael J. Sorich,Ashley M. Hopkins
Identifier
DOI:10.1016/j.ejca.2025.115274
Abstract
Recent advancements in large language models (LLMs) enable real-time web search, improved referencing, and multilingual support, yet ensuring they provide safe health information remains crucial. This perspective evaluates seven publicly accessible LLMs (ChatGPT, Co-Pilot, Gemini, MetaAI, Claude, Grok, and Perplexity) on three simple cancer-related queries across eight languages (English, French, Chinese, Thai, Hindi, Nepali, Vietnamese, and Arabic; 336 responses in total). None of the 42 English responses contained clinically meaningful hallucinations, whereas 7 of 294 non-English responses did. Overall, 48% (162/336) of responses included valid references, but 39% of the English references were .com links, reflecting quality concerns. English responses frequently exceeded an eighth-grade reading level, and many non-English outputs were also complex. These findings reflect substantial progress over the past two years but reveal persistent gaps in multilingual accuracy, reliable reference inclusion, referral practices, and readability. Ongoing benchmarking is essential to ensure LLMs safely support global health information dissemination and meet online information standards.
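The abstract benchmarks responses against an eighth-grade reading level but does not state which readability index was applied. As an illustrative sketch only, the Python below computes the widely used Flesch-Kincaid grade level for an English answer; the naive syllable counter and the sample answer text are assumptions for demonstration and are not the authors' method (and, being English-specific, would not apply to the non-English responses).

```python
import re


def count_syllables(word: str) -> int:
    """Rough English syllable estimate: count vowel groups, subtract a silent trailing 'e'."""
    word = word.lower()
    groups = re.findall(r"[aeiouy]+", word)
    count = len(groups)
    if word.endswith("e") and count > 1 and not word.endswith(("le", "ee")):
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)


if __name__ == "__main__":
    # Hypothetical LLM answer to a cancer-related query, used only to exercise the metric.
    answer = ("Chemotherapy uses cytotoxic drugs to destroy rapidly dividing "
              "malignant cells, though it may also affect healthy tissue.")
    grade = flesch_kincaid_grade(answer)
    print(f"Grade level: {grade:.1f}")  # a value above 8.0 exceeds an eighth-grade target
```

Production readability tooling typically uses a dictionary-based syllable count rather than the vowel-group heuristic above, so scores from this sketch should be treated as approximate.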