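Rasch model
Generative grammar
Computer science
Natural language processing
Artificial intelligence
Chemistry
Mathematics education
Psychology
Developmental psychology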
Authors
Benjamin Sorenson, Kenneth Hanson
Identifier
DOI:10.1021/acs.jchemed.4c00165
Abstract
Generative artificial intelligence (AI) technology is expected to have a profound impact on chemical education. While there are certainly positive uses, some of which are being actively implemented even now, there is a reasonable concern about its use in cheating. Efforts are underway to detect generative AI usage on open-ended questions, lab reports, and essays, but its detection on multiple-choice exams is largely unexplored. Here we propose the use of Rasch analysis to identify the unique behavioral pattern of ChatGPT on General Chemistry II multiple-choice exams. While raw statistics (e.g., average, ability, outfit) were insufficient to readily identify ChatGPT instances, a strategy of fixing the ability scale on high-success questions and then refitting the outcomes dramatically enhanced its outlier behavior in terms of the Z-standardized outfit statistic and ability displacement. Setting the detection threshold to a true positive rate (TPR) of 1.0, a false positive rate (FPR) of <0.1 was obtained across a majority of the 20 exams investigated here. Furthermore, the receiver operating characteristic curve (i.e., FPR vs TPR) exhibited outstanding areas under the curve of >0.9 for nearly all exams. While limitations of this method are described and the analysis is by no means exhaustive, these outcomes suggest that the unique behavior patterns of generative AI chatbots can be identified using Rasch modeling and fit statistics.
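The fit statistic and thresholding idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes the person ability and the item difficulties are already known rather than estimated, it uses the unstandardized outfit mean-square instead of the Z-standardized outfit, and it omits the refitting step and ability displacement. All function names and numbers below are hypothetical.

```python
import math

def rasch_prob(theta, b):
    """Rasch model: probability that ability theta answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def outfit_msq(responses, theta, difficulties):
    """Outfit mean-square: mean squared standardized residual across items."""
    total = 0.0
    for x, b in zip(responses, difficulties):
        p = rasch_prob(theta, b)
        total += (x - p) ** 2 / (p * (1.0 - p))
    return total / len(responses)

def fpr_at_full_tpr(scores, is_ai):
    """Put the threshold at the lowest AI score (TPR = 1.0) and
    return it with the fraction of non-AI respondents flagged."""
    threshold = min(s for s, a in zip(scores, is_ai) if a)
    humans = [s for s, a in zip(scores, is_ai) if not a]
    flagged = sum(1 for s in humans if s >= threshold)
    return threshold, flagged / len(humans)

# Hypothetical item difficulties (logits), easiest to hardest.
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Same ability, two response patterns (1 = correct, 0 = incorrect).
expected_pattern = [1, 1, 1, 0, 0]    # misses only the hard items
unexpected_pattern = [0, 0, 1, 1, 1]  # misses the easy items instead

fit_expected = outfit_msq(expected_pattern, 0.0, difficulties)
fit_unexpected = outfit_msq(unexpected_pattern, 0.0, difficulties)

# Hypothetical outfit scores for three students and two AI instances.
scores = [0.4, 4.0, 1.1, 4.2, 3.9]
is_ai = [False, False, False, True, True]
threshold, fpr = fpr_at_full_tpr(scores, is_ai)
```

In this toy data the pattern that misses easy items while answering hard ones yields a much larger outfit value than the pattern consistent with the model, which is the kind of outlier behavior a threshold on a fit statistic is meant to catch.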