
Performance of ChatGPT on Ophthalmology-Related Questions Across Various Examination Levels: Observational Study

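Medicine; United States Medical Licensing Examination; Ophthalmology; Licensure; Medical school; Medical education; Family medicine
Authors
Firas Haddad, Joanna S Saade
Source
Journal: JMIR Medical Education [JMIR Publications]
Volume/Issue: 10: e50842-e50842 Citations: 2
Identifier
DOI: 10.2196/50842
Abstract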

ChatGPT and language learning models have gained attention recently for their ability to answer questions on various examinations across various disciplines. Whether ChatGPT could be used to aid medical education has yet to be answered, particularly in the field of ophthalmology.

The aim of this study is to assess the ability of ChatGPT-3.5 (GPT-3.5) and ChatGPT-4.0 (GPT-4.0) to answer ophthalmology-related questions across different levels of ophthalmology training.

Questions from the United States Medical Licensing Examination (USMLE) steps 1 (n=44), 2 (n=60), and 3 (n=28) were extracted from AMBOSS, and 248 questions (64 easy, 122 medium, and 62 difficult questions) were extracted from the book Ophthalmology Board Review Q&A for the Ophthalmic Knowledge Assessment Program and the Board of Ophthalmology (OB) Written Qualifying Examination (WQE). Questions were prompted identically and input to GPT-3.5 and GPT-4.0.

GPT-3.5 achieved a total of 55% (n=210) of correct answers, while GPT-4.0 achieved a total of 70% (n=270) of correct answers. GPT-3.5 answered 75% (n=33) of questions correctly in USMLE step 1, 73.33% (n=44) in USMLE step 2, 60.71% (n=17) in USMLE step 3, and 46.77% (n=116) in the OB-WQE. GPT-4.0 answered 70.45% (n=31) of questions correctly in USMLE step 1, 90.32% (n=56) in USMLE step 2, 96.43% (n=27) in USMLE step 3, and 62.90% (n=156) in the OB-WQE. GPT-3.5 performed worse as examination levels advanced (P<.001), while GPT-4.0 performed better on USMLE steps 2 and 3 and worse on USMLE step 1 and the OB-WQE (P<.001). The coefficient of correlation (r) between ChatGPT answering correctly and human users answering correctly was 0.21 (P=.01) for GPT-3.5 as compared to -0.31 (P<.001) for GPT-4.0. GPT-3.5 performed similarly across difficulty levels, while GPT-4.0 performed worse as the difficulty level increased. Both GPT models performed significantly better on certain topics than on others.

ChatGPT is far from being considered a part of mainstream medical education. Future models with higher accuracy are needed for the platform to be effective in medical education.
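
The per-examination percentages in the abstract follow from the reported counts of correct answers and the number of questions in each examination. The sketch below recomputes them, assuming the counts and question totals exactly as stated in the abstract; the names `REPORTED`, `accuracy_pct`, and `summarize` are illustrative and do not come from the study.

```python
# Counts as stated in the abstract: (correct answers, questions asked).
# The dictionary layout and all names here are illustrative assumptions.
REPORTED = {
    "GPT-3.5": {
        "USMLE Step 1": (33, 44),
        "USMLE Step 2": (44, 60),
        "USMLE Step 3": (17, 28),
        "OB-WQE": (116, 248),
    },
    "GPT-4.0": {
        "USMLE Step 1": (31, 44),
        "USMLE Step 2": (56, 60),
        "USMLE Step 3": (27, 28),
        "OB-WQE": (156, 248),
    },
}


def accuracy_pct(correct, total):
    """Return the share of correct answers as a percentage, rounded to 2 decimals."""
    return round(100.0 * correct / total, 2)


def summarize(counts):
    """Compute per-examination accuracy plus an overall figure for one model."""
    result = {exam: accuracy_pct(c, t) for exam, (c, t) in counts.items()}
    total_correct = sum(c for c, _ in counts.values())
    total_questions = sum(t for _, t in counts.values())
    result["Overall"] = accuracy_pct(total_correct, total_questions)
    return result


if __name__ == "__main__":
    for model, counts in REPORTED.items():
        print(model, summarize(counts))
```

Most recomputed values agree with the abstract. With the counts as given, GPT-4.0 on USMLE step 2 comes out as 56/60, which is not the 90.32% printed in the abstract, so one of those two reported figures may differ in the full paper.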
