Improving mathematics assessment readability: Do large language models help?

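Readability, Computer science, Reading (process), Mathematics education, Natural language processing, Comprehension, Test (biology), Artificial intelligence, Reading comprehension, Writing assessment, Disadvantage, Psychology, Linguistics, Programming language, Paleontology, Philosophy, Biology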
Authors
Nirmal Patel, Pooja Nagpal, Tirth Shah, Archana Sharma, Shrey Malvi, Derek Lomas
Source
Journal: Journal of Computer Assisted Learning [Wiley]
Volume (issue): 39 (3): 804-822 Cited by: 2
Identifier
DOI: 10.1111/jcal.12776
Abstract

Background: Readability metrics provide us with an objective and efficient way to assess the quality of educational texts. We can use the readability measures for finding assessment items that are difficult to read for a given grade level. Hard‐to‐read math word problems can put some students at a disadvantage if they are behind in their literacy learning. Despite their math abilities, these students can perform poorly on difficult‐to‐read word problems because of their poor reading skills. Less readable math tests can create equity issues for students who are relatively new to the language of assessment. Less readable test items can also affect the assessment's construct validity by partially measuring reading comprehension.

Objectives: This study shows how large language models help us improve the readability of math assessment items.

Methods: We analysed 250 test items from grades 3 to 5 of EngageNY, an open‐source curriculum. We used the GPT‐3 AI system to simplify the text of these math word problems. We used text prompts and the few‐shot learning method for the simplification task.

Results and Conclusions: On average, GPT‐3 AI produced output passages that showed improvements in readability metrics, but the outputs had a large amount of noise and were often unrelated to the input. We used thresholds over text similarity metrics and changes in readability measures to filter out the noise. We found meaningful simplifications that can be given to item authors as suggestions for improvement.

Takeaways: GPT‐3 AI is capable of simplifying hard‐to‐read math word problems. The model generates noisy simplifications using text prompts or few‐shot learning methods. The noise can be filtered using text similarity and readability measures. The meaningful simplifications AI produces are sound but not ready to be used as a direct replacement for the original items. To improve test quality, simplifications can be suggested to item authors at the time of digital question authoring.
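
The filtering step described in the abstract (thresholds over a text similarity metric and over the change in a readability measure) can be illustrated with a minimal Python sketch. The abstract does not say which metrics or threshold values the authors used, so everything here is an assumption for illustration: the Flesch-Kincaid grade level stands in for the readability measure, word-level Jaccard overlap stands in for the similarity metric, the syllable counter is a rough heuristic, and the names `keep_candidate`, `min_similarity`, and `min_grade_drop` are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: one syllable per run of vowels, at least one per word."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)  # digits are ignored by this sketch
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def jaccard_similarity(a: str, b: str) -> float:
    """Share of distinct lowercase tokens that the two texts have in common."""
    tokens_a = set(re.findall(r"[a-z0-9]+", a.lower()))
    tokens_b = set(re.findall(r"[a-z0-9]+", b.lower()))
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def keep_candidate(original: str, candidate: str,
                   min_similarity: float = 0.25,   # assumed value, not from the paper
                   min_grade_drop: float = 0.5) -> bool:  # assumed value
    """Keep a model output only if it stays close to the original item
    and lowers the estimated reading grade by at least min_grade_drop."""
    similar_enough = jaccard_similarity(original, candidate) >= min_similarity
    grade_drop = flesch_kincaid_grade(original) - flesch_kincaid_grade(candidate)
    return similar_enough and grade_drop >= min_grade_drop


# Illustrative item and candidate outputs (made up for this sketch).
original = ("Maria purchased 12 boxes of apples, and every box contained 8 apples. "
            "Determine the total quantity of apples that Maria purchased.")
candidates = [
    "Maria bought 12 boxes of apples. Every box had 8 apples. "
    "How many apples did Maria buy?",
    "The weather is nice today.",  # unrelated output, i.e. noise
    original,                      # unchanged output, no readability gain
]
kept = [c for c in candidates if keep_candidate(original, c)]
```

In this sketch only the first candidate survives: the second shares almost no words with the item, and the third does not lower the grade estimate. In line with the abstract's takeaways, surviving outputs would be offered to item authors as suggestions rather than used as direct replacements.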