Generative artificial intelligence chatbots may provide appropriate informational responses to common vascular surgery questions by patients

Medicine, Readability, Vascular surgery, Surgery, Cardiac surgery, Linguistics, Philosophy
Authors
Ethan Chervonski, Keerthi Harish, Caron B. Rockman, Mikel Sadek, Katherine Teter, Glenn R. Jacobowitz, Todd Berland, Joann M. Lohr, Colleen M. Moore, Thomas S. Maldonado
Source
Journal: Vascular [SAGE]
Identifier
DOI: 10.1177/17085381241240550
Abstract

Objectives Generative artificial intelligence (AI) has emerged as a promising tool to engage with patients. The objective of this study was to assess the quality of AI responses to common patient questions regarding vascular surgery disease processes.
Methods OpenAI’s ChatGPT-3.5 and Google Bard were queried with 24 mock patient questions spanning seven vascular surgery disease domains. Six experienced vascular surgery faculty at a tertiary academic center independently graded AI responses on their accuracy (rated 1–4 from completely inaccurate to completely accurate), completeness (rated 1–4 from totally incomplete to totally complete), and appropriateness (binary). Responses were also evaluated with three readability scales.
Results ChatGPT responses were rated, on average, more accurate than Bard responses (3.08 ± 0.33 vs 2.82 ± 0.40, p < .01). ChatGPT responses were scored, on average, more complete than Bard responses (2.98 ± 0.34 vs 2.62 ± 0.36, p < .01). Most ChatGPT responses (75.0%, n = 18) and almost half of Bard responses (45.8%, n = 11) were unanimously deemed appropriate. Almost one-third of Bard responses (29.2%, n = 7) were deemed inappropriate by at least two reviewers, and two Bard responses (8.4%) were considered inappropriate by the majority. The mean Flesch Reading Ease, Flesch–Kincaid Grade Level, and Gunning Fog Index of ChatGPT responses were 29.4 ± 10.8, 14.5 ± 2.2, and 17.7 ± 3.1, respectively, indicating that responses were readable with a post-secondary education. Bard’s mean readability scores were 58.9 ± 10.5, 8.2 ± 1.7, and 11.0 ± 2.0, respectively, indicating that responses were readable with a high-school education (p < .0001 for all three metrics). ChatGPT’s mean response length (332 ± 79 words) was higher than Bard’s mean response length (183 ± 53 words, p < .001). There was no difference in the accuracy, completeness, readability, or response length of ChatGPT or Bard between disease domains (p > .05 for all analyses).
Conclusions AI offers a novel means of educating patients that avoids the inundation of information from “Dr Google” and the time barriers of physician-patient encounters. ChatGPT provides largely valid, though imperfect, responses to myriad patient questions at the expense of readability. While Bard responses are more readable and concise, their quality is poorer. Further research is warranted to better understand failure points for large language models in vascular surgery patient education.
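
The abstract names three readability scales but does not say how the scores were computed. As a rough illustration only, the standard published formulas for Flesch Reading Ease, Flesch–Kincaid Grade Level, and Gunning Fog Index can be sketched in Python as below. The syllable counter and the tokenizer (`count_syllables`, `text_counts`) are hypothetical, simplified heuristics introduced for this sketch and are not the authors' method; dedicated tools count syllables and complex words more carefully, so scores from this sketch will not exactly match those reported.

```python
import re


def flesch_reading_ease(words, sentences, syllables):
    # Higher scores mean easier text.
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    # Approximate US school grade level needed to read the text.
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def gunning_fog(words, sentences, complex_words):
    # complex_words: words with three or more syllables.
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))


def count_syllables(word):
    # ASSUMPTION: crude heuristic that counts vowel groups and drops a
    # trailing silent "e"; real tools use dictionaries or better rules.
    lowered = word.lower()
    n = len(re.findall(r"[aeiouy]+", lowered))
    if lowered.endswith("e") and not lowered.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)


def text_counts(text):
    # ASSUMPTION: sentences end at ., ! or ?; words are runs of letters.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)
    return len(words), len(sentences), syllables, complex_words


if __name__ == "__main__":
    sample = "An aneurysm is a bulge in a blood vessel. It can grow over time."
    w, s, syl, cx = text_counts(sample)
    print(flesch_reading_ease(w, s, syl))
    print(flesch_kincaid_grade(w, s, syl))
    print(gunning_fog(w, s, cx))
```

On these scales a higher Flesch Reading Ease means easier text, while higher Flesch–Kincaid and Gunning Fog values mean more years of schooling are needed, which matches the direction of the ChatGPT and Bard scores reported in the Results.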