Harnessing artificial intelligence in bariatric surgery: comparative analysis of ChatGPT-4, Bing, and Bard in generating clinician-level bariatric surgery recommendations

Readability; Medicine; Likert scale; Surgery; Psychology; Developmental psychology; Philosophy; Linguistics
Authors
Yung Lee,Thomas H. Shin,Léa Tessier,Arshia Javidan,James J. Jung,Dennis Hong,Andrew T. Strong,Tyler McKechnie,Sarah Malone,David Jin,Matthew Kroh,Jerry T. Dang
Source
Journal: Surgery for Obesity and Related Diseases [Elsevier BV]
Volume (issue): 20 (7): 603-608 Citations: 22
Identifier
DOI:10.1016/j.soard.2024.03.011
Abstract

Background: The formulation of clinical recommendations pertaining to bariatric surgery is essential in guiding healthcare professionals. However, the extensive and continuously evolving body of literature in bariatric surgery presents a considerable challenge for staying abreast of the latest developments and for efficient information acquisition. Artificial intelligence (AI) has the potential to streamline access to the salient points of clinical recommendations in bariatric surgery.

Objective: The study aims to appraise the quality and readability of AI-chat-generated answers to frequently asked clinical inquiries in the field of bariatric and metabolic surgery.

Setting: Remote.

Methods: Question prompts for AI large language models (LLMs) were created based on pre-existing clinical practice guidelines regarding bariatric and metabolic surgery. The prompts were submitted to three LLMs: OpenAI ChatGPT-4, Microsoft Bing, and Google Bard. The responses from each LLM were entered into a spreadsheet for randomized and blinded duplicate review. Accredited bariatric surgeons in North America independently assessed the appropriateness of each recommendation using a 5-point Likert scale. Scores of 4 and 5 were deemed appropriate, while scores of 1 to 3 indicated a lack of appropriateness. A Flesch Reading Ease (FRE) score was calculated to assess the readability of the responses generated by each LLM.

Results: There was a significant difference between the three LLMs in their 5-point Likert scores, with mean values of 4.46 (SD 0.82), 3.89 (0.80), and 3.11 (0.72) for ChatGPT-4, Bard, and Bing, respectively (P<0.001). There was a significant difference between the three LLMs in the proportion of appropriate answers, with ChatGPT-4 at 85.7%, Bard at 74.3%, and Bing at 25.7% (P<0.001). The mean FRE scores for ChatGPT-4, Bard, and Bing were 21.68 (SD 2.78), 42.89 (4.03), and 14.64 (5.09), respectively, with higher scores representing easier readability.

Conclusion: LLM-based AI chat models can effectively generate appropriate responses to clinical questions related to bariatric surgery, though the performance of different models can vary greatly. Therefore, caution should be taken when interpreting clinical information provided by LLMs, and clinician oversight is necessary to ensure accuracy. Future investigation is warranted to explore how LLMs might enhance healthcare provision and clinical decision-making in bariatric surgery.
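The two scoring rules named in the abstract can be illustrated with a minimal Python sketch. It uses the standard Flesch Reading Ease formula and the abstract's cutoff of 4 or 5 for an appropriate answer. The function names and the example inputs are hypothetical, and the sketch takes word, sentence, and syllable counts as given; the abstract does not say which tool the authors used to obtain them, so this is not their pipeline.

```python
def flesch_reading_ease(total_words: int, total_sentences: int, total_syllables: int) -> float:
    """Standard Flesch Reading Ease formula; higher scores mean easier text.

    The counts are assumed to be supplied by the caller (hypothetical inputs).
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def is_appropriate(likert_score: int) -> bool:
    """Abstract's rule: scores of 4 and 5 are appropriate, 1 to 3 are not."""
    if not 1 <= likert_score <= 5:
        raise ValueError("Likert score must be between 1 and 5")
    return likert_score >= 4


def proportion_appropriate(likert_scores: list[int]) -> float:
    """Fraction of ratings that meet the appropriateness cutoff."""
    if not likert_scores:
        raise ValueError("at least one score is required")
    return sum(is_appropriate(s) for s in likert_scores) / len(likert_scores)


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    print(flesch_reading_ease(total_words=100, total_sentences=5, total_syllables=150))
    print(proportion_appropriate([5, 4, 3, 2, 4]))
```

Because the formula penalizes long sentences and polysyllabic words, dense clinical prose tends to score low, which is in keeping with the mean FRE values reported for all three models.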