Addressing the Novel Implications of Generative AI for Academic Publishing, Education, and Research

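Keywords: Publishing; Generative grammar; Higher education; Medical education; Library science; Psychology; Sociology; Data science; Computer science; Medicine; Political science; Artificial intelligence; Law
Author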
Laura Weiss Roberts
Source
Journal: Academic Medicine [Lippincott Williams & Wilkins]
Volume (issue): 99 (5): 471-473. Cited by: 5
Identifier
DOI:10.1097/acm.0000000000005667
Abstract

Editor's Note: The opinions expressed in this editorial do not necessarily reflect the opinions of the AAMC or its members.
Generative artificial intelligence (GenAI), such as ChatGPT, has dramatically changed academic publishing, education, and research in a very short time. Novel issues have arisen for the developers of GenAI, as well as for authors, researchers, editors, reviewers, readers, and the public [1,2]. The idea that GenAI, like all tools, could be misused was anticipated, as have been many of the legal, financial, and technical challenges associated with artificial intelligence (AI) [3–6]. But only with broad use of GenAI tools have new, really complex, and truly unexpected phenomena arisen. For instance, the observation that GenAI would invent or fabricate data and references that it deemed "should" exist, i.e., that GenAI would hallucinate (Dictionary.com's 2023 Word of the Year [7] to acknowledge GenAI's profound ramifications), caught us all by surprise [8,9].
The literature on the use of AI tools in academic scholarship has rapidly expanded, and the findings are humbling. As one example, Májovský et al [10] performed a proof-of-concept study using ChatGPT to create a "highly convincing" but "completely fabricated article" in the field of neurosurgery. The authors were successful in generating, in 1 hour, a full manuscript that appeared "sophisticated and seemingly flawless" that included an "abstract, introduction, material and methods, discussion, references, charts, etc.," with 1,992 words and 17 citations. Only with careful review by experts from different disciplines were errors identifiable in the fabricated article. In another recently published report [11], linguists Casal and Kessler performed a structured interview study of journal reviewers (n = 72) and found that they "were largely unsuccessful in identifying AI versus human writing, with an overall positive identification rate of only 38.9%."
Moreover, the use of AI itself in evaluating human-generated versus AI-generated scholarship has revealed mixed results thus far. Gao et al [12], for example, found that an AI detection tool based on the GPT-2 large language model greatly outperformed blinded human reviewers in discriminating between manuscript abstracts that were human-written and those generated by ChatGPT. In that same project, a plagiarism detector website rated human-written abstracts as more likely to have been plagiarized, as a greater percentage of matching text was found online. AI-generated abstracts were rated as being less similar to existing text. The mixed results emerging from this literature point to a need for further research on the possible merits and risks of using GenAI tools to enhance the writing process, as well as to develop robust scientific integrity safeguards.
Academic Medicine proffered guidance to authors in 2023 regarding the use of AI tools in the preparation of manuscripts for our journal [13]. We introduced a new policy for submissions that emphasized ethically salient aspects of accountability, disclosure, and transparency for authors engaging with AI tools [13]. In keeping with the position of the Committee on Publication Ethics [14], we affirmed that AI tools must not be listed as authors.
To serve as authors, individuals must have fulfilled 4 key requirements: contributing in a substantive manner to a manuscript, participating in the writing or revising of the work, providing approval for the final version of the work, and agreeing to be accountable publicly for all aspects of the work [15,16]. Given the potential risks and rapid evolution of AI tools, we further highlighted the need for ongoing efforts by authors in ensuring the "accuracy, rigor, and integrity" of their scholarship [14].
Working in parallel with our authors, the full editorial team of Academic Medicine committed to ongoing review and revision of the policies and practices of our journal "to align with the academic standards of our field" [13]. Considerations that we have begun to work through are wide-ranging, related to the integrity and accessibility of data sets, strengthened review procedures, appropriate and selective use of AI detection tools in evaluating submissions, stricter requirements for authors and reviewers, safeguards with editorial management and publisher software, added monitoring processes, and potential consequences for AI-generated submissions by authors who failed to disclose their use of these tools. Members of our editorial team are also engaged with colleagues throughout the field of academic medicine and academic publishing to study, assess, and address emerging ethical questions related to scientific integrity and misconduct in the new era of GenAI [17].
Many of these ideas resonate with the recommendations advanced in a recent editors' statement [18] on the responsible use of GenAI technologies in scholarly journal publishing. Those editors recommended that large language model GPT or other GenAI tools not be included as authors and that authors should be fully transparent about their use of AI tools.
The editors provided further guidance related to editors' and reviewers' roles in evaluating scholarly submissions, including that editors should have "access to tools and strategies for ensuring authors' transparency," that editors and reviewers should not themselves "rely solely" on GenAI to review manuscript submissions, and that editors should "retain full responsibility" for selecting reviewers and overseeing the review process. The last recommendation was that the ultimate responsibility for editing a manuscript resides with "human authors and editors."
Creating additional safeguard practices and clearer policies should help to lessen the likelihood that journals inadvertently publish scholarly works that have been fabricated or are otherwise fraudulent due to the use of GenAI. As noted by Májovský et al [10], several measures are needed to reduce this risk across scholarly publishing: providing source data sets; establishing rigorous review procedures; creating ethical regulations for publishers and academic institutions; and having adverse consequences, such as temporary or permanent bans from publishing with certain journals for researchers found to have engaged in misconduct. Journals, publishers, professional societies, scholars, and other stakeholders will need to consider such ideas carefully, and quickly.
An analysis by Lee et al [19] examined the views and policies in July 2023 of the 50 leading journals in one specialty of medicine (i.e., radiology) and found that 45% did not include guidance regarding the use of AI as part of their submission guidelines for authors. Most (82%) of the journals that did share guidance deferred to the policies of a large publishing group, suggesting that specialty-specific issues will require additional thought. In this illustration drawn from the field of radiology, the question of verifying the authenticity of images that might be AI-generated will be particularly salient.
The imperative for greater understanding of the applications and societal implications of AI is clear. An Executive Order [20] from the White House issued in October 2023 declared the need for society "to foster capabilities for identifying and labeling synthetic content produced by AI systems, and to establish the authenticity and provenance of digital content." The order also initiated an effort to study and document "issues that may hinder the effective use of AI in research and practices needed to ensure that AI is used responsibly for research."
In academic medicine, the need for greater AI literacy is also apparent. Recognizing the need for physicians-in-training to have greater competence in medical AI, Lee et al [21] used an iterative Delphi method with broad engagement of experts, faculty, and students from medical schools across South Korea to derive 6 broad domains of medical AI competencies. Four of the domains were identified as essential for medically trained graduates to understand: digital health and changes driven by AI, fundamental knowledge and skills in medical AI, the ethics and legal aspects in the use of medical AI, and medical AI application in clinical practice. Two other domains, viewed as important but optional, were (1) processing, analyzing, and evaluating medical data and (2) undertaking research and development of medical AI. Taken together, these 6 domains encompass 36 specific competencies and subcompetencies. This report promises to be helpful as medical educators accelerate their efforts to develop curricula that are responsive to the increasing role and accelerating use of medical AI.
How AI methods are used widely in health professions education, and the potential benefits and problems, are thoughtfully described by Patino et al [22] in this issue of the journal. In terms of advantages, those authors suggest that AI tools have the potential to complete certain data-related tasks with less direct human effort and in less time.
These possible advantages were mentioned in the Innovation Report by Laupichler et al [23] also appearing in this issue. In that report, researchers compared the performances by medical student volunteers (n = 161) on multiple-choice examination questions developed by humans and by ChatGPT. They found that the questions were similarly difficult but that questions developed by human authors had a significantly higher discriminatory power and thus were better able to differentiate student test performances. The underlying reasons for this statistical result regarding greater discriminatory power of human-created questions remain unclear: are experienced educators better at developing more salient and more valid questions, or was the result due to a technical issue with GenAI tools that, with time, could be resolved? Interestingly, the student volunteers were able to identify with 57% accuracy whether the questions were created by human or ChatGPT sources.
GenAI tools have the risk of perpetuating or amplifying bias, as noted by Patino et al and others [22,24]. Bias may result from how the underlying algorithms were developed, trained, and deployed [25]. Such biases may greatly affect the interpretation and application of findings and run the risk of generating misinformation or disinformation [22]. For these reasons, the authors identify trustworthiness as a crucial issue, particularly when the task at hand is "high stakes," e.g., shapes clinical care recommendations, as has been shown by Kasun et al [24]. In health professions education, evaluating trainee performance is similarly "high stakes" and the need for trustworthiness is paramount. For these reasons, Patino et al conclude that AI methods should be engaged carefully and considered as part of a large repertoire of techniques (e.g., biostatistics) in health professions education:
AI methods are not magical or infallible, and their use requires thoughtful reflection. They should be seen as tools, complementing the other resources and skills that faculty and researchers already possess [22].
In making this argument, Patino et al continue to place the responsibility for trustworthiness on the shoulders of faculty members and researchers, human beings who serve in the field of academic medicine and fulfill the obligations of our profession. Ensuring that AI is used ethically to advance the salutary aims of academic medicine certainly entails trustworthy actions of faculty members and researchers. As the surprising and potent consequences of ever-widening use of AI have already taught us, however, it will take much, much more than individual efforts to safeguard against mischief, misuse, and misconduct related to applications of AI, especially in high-stakes activities in clinical care and clinical training. It will take dedicated commitment of all leaders and stakeholders of our field, working together and fully aware that potential consequences may lie far outside of our current imaginations.
