The Evaluation of Generative AI Should Include Repetition to Assess Stability. (Preprint)

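Generative grammar, Robustness (evolution), Computer science, Repetition (rhetorical device), Reliability, Preprint, Theory (learning stability), Reliability (semiconductor), Randomness, Field (mathematics), Artificial intelligence, Data science, Machine learning, Gene, Law, Pure mathematics, Political science, World Wide Web, Quantum mechanics, Mathematics, Statistics, Physics, Power (physics), Philosophy, Linguistics, Chemistry, Biochemistry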

Authors

Lingxuan Zhu,Weiming Mou,Chenglin Hong,Yang Tao,Y. F. Lai,Qi Chen,Anqi Lin,Jian Zhang,Peng Luo

Source

Journal: JMIR mHealth and uHealth [JMIR Publications]
Date: 2024-03-01

Links

amazonaws.com, nih.gov, doi.org

Identifier

DOI: 10.2196/57978

Abstract

The increasing interest in the potential applications of generative AI models like ChatGPT-3.5 in healthcare has prompted numerous studies exploring its performance in various medical contexts. However, evaluating ChatGPT poses unique challenges due to the inherent randomness in its responses. Unlike traditional AI models, ChatGPT generates different responses for the same input, making it imperative to assess its stability through repetition. This commentary highlights the importance of including repetition in the evaluation of ChatGPT to ensure the reliability of conclusions drawn from its performance. Similar to biological experiments, which often require multiple repetitions for validity, we argue that assessing generative AI models like ChatGPT demands a similar approach. Failure to acknowledge the impact of repetition can lead to biased conclusions and undermine the credibility of research findings. We urge researchers to incorporate appropriate repetition in their studies from the outset and transparently report their methods to enhance the robustness and reproducibility of findings in this rapidly evolving field.
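The repetition the abstract calls for can be illustrated with a minimal Python sketch. It is not the authors' protocol: the function names (`stability_report`, `make_toy_model`), the two summary statistics, and the toy model that answers "A" about 70% of the time are all assumptions made for illustration. In a real study, `ask` would wrap a call to the generative model under evaluation.

```python
import random
from collections import Counter


def stability_report(ask, questions, answer_key, repeats=5):
    """Ask each question `repeats` times and summarise accuracy and agreement.

    `ask` is any callable mapping a question to an answer string
    (hypothetical interface, not a real model API).
    """
    report = {}
    for q in questions:
        answers = [ask(q) for _ in range(repeats)]
        counts = Counter(answers)
        _, modal_count = counts.most_common(1)[0]
        report[q] = {
            # fraction of repeated runs that match the reference answer
            "accuracy": sum(a == answer_key[q] for a in answers) / repeats,
            # fraction of runs that gave the single most common answer
            "agreement": modal_count / repeats,
            # how many different answers appeared across the runs
            "distinct_answers": len(counts),
        }
    return report


def make_toy_model(seed=0):
    """Hypothetical stand-in for a generative model with random output."""
    rng = random.Random(seed)

    def ask(question):
        # Assumption for illustration: answers "A" roughly 70% of the time.
        return "A" if rng.random() < 0.7 else "B"

    return ask


if __name__ == "__main__":
    key = {"Q1": "A", "Q2": "B"}
    report = stability_report(make_toy_model(), list(key), key, repeats=10)
    for question, stats in report.items():
        print(question, stats)
```

A single run would report each question as simply right or wrong. Repeating the query exposes how often the answer changes, which is the stability the commentary asks researchers to report alongside accuracy.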




