Comparing large language models and human doctors in symptom-driven online medical consultations: A case study on trigeminal neuralgia

Authors
Lirui Duan, Zhong Yao, X. Li, Yanfei Wu, Dongfang Sheng
Source
Journal: Digital Health [SAGE Publishing]
Volume/Issue: 11: 20552076251388140-20552076251388140
Identifier
DOI: 10.1177/20552076251388140
Abstract

Objective: To evaluate the performance of generative AI tools, specifically Ernie Bot and ChatGPT, in supporting online medical consultations in China, focusing on their accuracy, safety, and empathy, and to assess their potential role in addressing the supply-demand gap in the healthcare system.

Methods: We collected 233 trigeminal neuralgia consultations from a Chinese medical platform, including patient questions and doctor replies. Each question was input into ChatGPT-3.5 and Ernie Bot with role-specific prompts to generate large language model (LLM) responses. Four blinded raters (two doctors and two patients) evaluated all responses using DISCERN and a modified PEMAT. Lexical, syntactic, and semantic analyses were conducted, with Spearman correlations assessing links between linguistic features and perceived quality.

Results: While doctors led in reliability, Ernie Bot scored highest overall, especially in empathy and clarity, likely due to stylistic choices rather than true understanding. Despite their fluency, LLMs remain prone to factual errors. Text analysis showed distinct linguistic patterns, with several features significantly correlated with perceived quality.

Conclusion: LLMs demonstrate strengths in perceived empathy and clarity but fall short in clinical accuracy and depth when addressing complex cases. Although they outperform doctors in communication-related aspects, their limitations in high-risk decision-making remain evident. As such, LLMs hold promise as adjunct tools for non-urgent consultations, but further refinement is needed to meet the standards of precise and personalized healthcare delivery.
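
The Methods mention Spearman correlations between linguistic features and perceived quality. The abstract does not describe the authors' code or data, so the following is only a minimal sketch of how such a rank correlation can be computed, assuming one numeric feature value and one rater score per response. The function names (`average_ranks`, `spearman_rho`) and the sample numbers are hypothetical and used purely for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    # Made-up illustration values, NOT data from the study:
    # one linguistic feature (mean sentence length) and one quality rating
    # per response.
    mean_sentence_length = [12.0, 18.5, 9.0, 22.0, 15.5, 18.5]
    quality_rating = [3, 4, 2, 5, 3, 4]
    print(round(spearman_rho(mean_sentence_length, quality_rating), 3))
```

Working on ranks rather than raw values makes the measure suitable for ordinal rating scales, which is presumably why a rank-based correlation was used here; significance testing is omitted from this sketch.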
