Abstract
Artificial intelligence (AI), which once existed solely in science fiction, has arrived as part of our daily lives. In the past year, various AI models have been released for public use, with sweeping ramifications. One company in particular, OpenAI, has offered two popular AI models, DALL-E and ChatGPT, to the public for free. DALL-E is an artificial intelligence model that produces original computer-generated images, whereas ChatGPT generates text while interacting with users in a conversational way.1 For example, a user might say, “Hi ChatGPT,” and ChatGPT might write back, “Hello, how are you today?” Or a user could say, “ChatGPT, write an essay on deep inferior epigastric perforator (DIEP) flaps,” and ChatGPT would respond within seconds with a well-written, knowledgeable-sounding essay about DIEP flaps. Interestingly, ChatGPT was not simply created all at once but was trained using what OpenAI calls “reinforcement learning from human feedback.”1 During this training, ChatGPT was exposed to a wide array of human text and information, giving it the base of knowledge needed to write such an essay about DIEP flaps.

Given these wide-ranging capabilities, the model has quickly gained massive popularity. After being launched on November 30, 2022, the research release of ChatGPT reached 1 million users in just 5 days.2 By comparison, Facebook took approximately 10 months to reach 1 million users.3 Although the model is currently free to the public, the service’s founder has already stated that efforts will eventually be made to generate revenue from the model. In addition, OpenAI recently strengthened its partnership with Microsoft after receiving a multibillion-dollar investment and will likely continue to expand.4 Innovative uses of ChatGPT have been documented across a variety of fields, including medicine.
In one example, a physician shared a TikTok video of himself using the AI tool to quickly generate prior authorization letters, albeit with significant editing required.5 In another scenario, a mental health app used ChatGPT to help provide mental health support to approximately 4000 people.5 This experiment drew significant backlash, given its ethical implications, but it highlights a larger trend in which people report turning to AI as a source of therapy.6,7 Furthermore, ChatGPT was able to perform at a level comparable to a third-year medical student on National Board of Medical Examiners examinations and to pass the United States Medical Licensing Examination Step examinations.8,9 Finally, in a much simpler example and as a demonstration of how ChatGPT functions, we asked the model to write a poem about the Plastic and Reconstructive Surgery journal using the command, “Write a poem about the Plastic and Reconstructive Surgery journal” (Fig. 1). Drawing on the knowledge of poetic conventions and of the journal that it gained during its training, whether inferred or explicitly taught, ChatGPT wrote a completely original poem. It did not need to be trained specifically on the topic of plastic and reconstructive surgery, nor did it require a program designed to write poetry. If one were to ask ChatGPT to write the poem again using the same prompt, it would generate a different original poem altogether.

Fig. 1. Poem generated by ChatGPT.

As with all emerging technology, ChatGPT has significant limitations. Primarily, the AI model can generate incorrect and misleading information while appearing confident about the veracity of the material. As a result, if not rigorously fact-checked and edited, fallacious material can be unknowingly distributed by ChatGPT users.
Accordingly, OpenAI’s website lists limitations for ChatGPT, the first of which is, “ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers.”1 This was recently highlighted in a report that asked 15 radiologists to assess ChatGPT’s ability to simplify radiology reports. Although most of the generated reports were factually correct and complete, about half of the radiologists found statements with a “high potential of leading patients to the wrong conclusion,” as well as various errors that led to the exclusion of key medical findings.10 For example, ChatGPT often interpreted a “differential diagnosis” as the final diagnosis, falsely leading patients to believe that they had conditions that were merely one of many possibilities. Another report falsely stated that there was no evidence of cancer spreading to other parts of the body, when evidence of pulmonary metastases was documented in the original report.10 Furthermore, because it is not connected to the internet, ChatGPT’s knowledge base is limited to the information it was exposed to during its creation; it is therefore ignorant of current events and new advances.11 Similarly, because the model draws only on the human-generated information it was trained on, any biases present in that information may also appear in ChatGPT’s outputs. Thus, it does not necessarily deliver the right answer, but rather an answer based on the information it acquired and inferred from its training. As a simplified example, if ChatGPT were exposed to a racist blog during its training, it might then generate racist dialogue when asked certain questions.
Caliskan et al.12 highlighted this principle by using the Implicit Association Test to replicate a series of known human biases.13 The authors then demonstrated that machines can learn the implicit associations exhibited by humans, and thus risk perpetuating dangerous biases and stereotypes.12 OpenAI has made efforts to address the issue of bias but has acknowledged that it remains a problem, stating, “While we’ve made efforts to make the model refuse inappropriate requests, it will sometimes respond to harmful instructions or exhibit biased behavior.”1 Incorrect and biased material generated by AI platforms led to significant setbacks for another AI model, Galactica, which was created by Facebook’s parent company, Meta. Galactica is a language model similar to ChatGPT that was trained on scientific literature. Following its public release in mid-November of 2022, however, it was removed from public access within days after facing criticism for presenting biased and incorrect information as fact.14 As Meta works to reform the model, Galactica serves as an ongoing reminder of the limitations of AI tools similar to ChatGPT.

Given its utility and despite its flaws, ChatGPT has been used in scientific research and has subsequently become a hotly debated topic among peer-reviewed journals. ChatGPT has been applied to prepare manuscripts, and in several cases, it was even listed as an author.15 As a result, one study investigated whether medical researchers (ie, potential editors) could distinguish AI-generated abstracts from human-generated abstracts. The study found that human reviewers incorrectly believed that 32% of AI-generated abstracts had been written by humans.16 Thus, medicine and science stand at an impasse, with the ethics of using AI models such as ChatGPT to write scientific text still not clearly defined.
However, multiple publishers and journals have agreed that ChatGPT and other AI models do not meet the legal requirements of an author.15 Technology is wonderful as a tool, but there are aspects of human communication that should be left to human beings, and we will need to see over time how natural language processing is or is not incorporated into daily life. The exercise of writing causes writers to deeply examine their data analysis and to reflect on their message globally. Plastic and Reconstructive Surgery holds the position that the use of ChatGPT and other AI models should not be encouraged, nor should they be included as authors on scientific articles. Authors must use extreme caution and judicious ethical discretion when engaging ChatGPT or other AI services to generate academic writing or other portions of their study. Authors are personally liable for any mistakes made by the tool but reported by them as fact. If AI tools are used, authors should disclose their use in the acknowledgments on the title page, rather than delegating authorship. Finally, investigators must work closely with their institutional review boards to ensure that the use of AI in their research is ethical, especially with regard to the security of patient information.

DISCLOSURE

The authors have no financial interest to declare in relation to the content of this article.