Improving mathematics assessment readability: Do large language models help?

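Readability, Computer science, Reading (process), Mathematics education, Natural language processing, Comprehension, Test (biology), Artificial intelligence, Reading comprehension, Writing assessment, Disadvantage, Psychology, Linguistics, Programming language, Paleontology, Philosophy, Biology
Authors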
Nirmal Patel,Pooja Nagpal,Tirth Shah,Archana Sharma,Shrey Malvi,Derek Lomas
Source
Journal: Journal of Computer Assisted Learning [Wiley]
Volume (issue): 39 (3): 804-822 Citations: 2
Identifier
DOI:10.1111/jcal.12776
Abstract

Background
Readability metrics provide us with an objective and efficient way to assess the quality of educational texts. We can use readability measures to find assessment items that are difficult to read for a given grade level. Hard-to-read math word problems can put some students at a disadvantage if they are behind in their literacy learning. Despite their math abilities, these students can perform poorly on difficult-to-read word problems because of their poor reading skills. Less readable math tests can create equity issues for students who are relatively new to the language of assessment. Less readable test items can also affect the assessment's construct validity by partially measuring reading comprehension.

Objectives
This study shows how large language models help us improve the readability of math assessment items.

Methods
We analysed 250 test items from grades 3 to 5 of EngageNY, an open-source curriculum. We used the GPT-3 AI system to simplify the text of these math word problems. We used text prompts and the few-shot learning method for the simplification task.

Results and Conclusions
On average, GPT-3 AI produced output passages that showed improvements in readability metrics, but the outputs had a large amount of noise and were often unrelated to the input. We used thresholds over text similarity metrics and changes in readability measures to filter out the noise. We found meaningful simplifications that can be given to item authors as suggestions for improvement.

Takeaways
GPT-3 AI is capable of simplifying hard-to-read math word problems. The model generates noisy simplifications using text prompts or few-shot learning methods. The noise can be filtered using text similarity and readability measures. The meaningful simplifications AI produces are sound but not ready to be used as a direct replacement for the original items. To improve test quality, simplifications can be suggested to item authors at the time of digital question authoring.
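
Illustrative sketch
The abstract describes filtering noisy model outputs with thresholds over text similarity and changes in readability, but it does not name the specific metrics or cut-off values. The Python sketch below shows one way such a filter could look. Everything specific in it is an assumption made for illustration: the Flesch-Kincaid grade level as the readability measure, a vowel-group syllable heuristic, word-level `difflib.SequenceMatcher` as the similarity measure, the threshold values, and the hypothetical function name `keep_simplification`. It is not the authors' implementation.

```python
import re
from difflib import SequenceMatcher


def count_syllables(word: str) -> int:
    """Rough heuristic (assumption): one syllable per group of vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level; higher means harder to read."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


def similarity(a: str, b: str) -> float:
    """Word-level similarity in [0, 1], standing in for a text similarity metric."""
    tokens_a = re.findall(r"[a-z0-9']+", a.lower())
    tokens_b = re.findall(r"[a-z0-9']+", b.lower())
    return SequenceMatcher(None, tokens_a, tokens_b).ratio()


def keep_simplification(original: str, candidate: str,
                        min_similarity: float = 0.4,
                        min_grade_drop: float = 0.5) -> bool:
    """Keep a candidate only if it stays close to the original item AND
    lowers the grade level. Both thresholds are illustrative guesses."""
    close_enough = similarity(original, candidate) >= min_similarity
    grade_drop = flesch_kincaid_grade(original) - flesch_kincaid_grade(candidate)
    return close_enough and grade_drop >= min_grade_drop


original = ("Maria purchased seventeen delicious chocolate cookies, and "
            "subsequently she distributed nine of them to her neighbours; "
            "determine how many cookies remained.")
candidates = [
    "Maria bought seventeen cookies. She gave nine of them to her "
    "neighbours. How many cookies are left?",   # related and easier to read
    "The cat sat on the mat.",                   # unrelated noise
]
kept = [c for c in candidates if keep_simplification(original, c)]
```

With these assumed thresholds, `kept` holds only the first candidate: the unrelated sentence fails the similarity check, and an unchanged copy of the original would fail the readability check. A filter like this only narrows the pool of suggestions; it cannot confirm that the numbers and the question in a word problem were preserved, so human review by item authors remains necessary, as the abstract's takeaways indicate.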