Computer Science
Natural Language Processing
Artificial Intelligence
Mathematics Education
Psychology
Authors
Chamuditha Senanayake, Dinesh Asanka
Identifier
DOI:10.1109/scse61872.2024.10550624
Abstract
The manual grading of short answers presents challenges in education: it is time-consuming, especially with larger student populations, and it is subjective, introducing biases that lead to inconsistent evaluations. As student numbers rise, the imbalance in the teacher-to-student ratio further degrades grading quality, affecting the fairness and effectiveness of educational assessments. Automated grading systems have emerged as a solution to these issues, but they mainly prioritize surface features, emphasizing grammar and format, and struggle to accurately assess content quality, often missing contextual relevance. Highly trained domain-specific models can mitigate this problem, yet they are limited to evaluating answers within their predefined domains: while such specialized models excel in assessing responses within their designated fields, their utility is restricted for answers outside those fields, which limits broader applicability. This study proposes a rubric-based method paired with Large Language Models (LLMs) to introduce objectivity, ensuring fairness and reliability in evaluations while achieving generalizability. A rubric provides a clear and customizable marking schema for assessing short answers across various domains; by using predetermined marking criteria and conditions, the grading process becomes more objective and transparent. The proposed method uses LLMs to evaluate short answers in various domains against these established criteria, reducing subjective biases. This research aims to revolutionize education by creating a robust automated short answer scoring system that comprehensively evaluates content across domains and addresses teacher-to-student ratio issues.
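To make the idea of rubric-based LLM grading concrete, the following is a minimal sketch, not the authors' implementation: it assumes a rubric expressed as (criterion, max marks) pairs, composes a grading prompt from those criteria, and delegates scoring to a generic text-in/text-out LLM. The `llm` callable, `RubricCriterion` structure, and the example rubric are all hypothetical placeholders introduced here for illustration.

```python
# Illustrative sketch of rubric-based grading with an LLM (hypothetical, not from the paper).
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RubricCriterion:
    description: str   # what a correct answer must contain to earn the marks
    max_marks: float   # marks awarded when the criterion is fully satisfied


def build_grading_prompt(question: str, answer: str,
                         rubric: List[RubricCriterion]) -> str:
    """Turn the rubric into explicit, per-criterion grading instructions."""
    criteria = "\n".join(
        f"{i}. ({c.max_marks} marks) {c.description}"
        for i, c in enumerate(rubric, start=1)
    )
    return (
        "Grade the student answer strictly against the rubric below.\n"
        f"Question: {question}\n"
        f"Student answer: {answer}\n"
        f"Rubric:\n{criteria}\n"
        "For each criterion, state the marks awarded and a brief justification, "
        "then report the total."
    )


def grade_short_answer(question: str, answer: str,
                       rubric: List[RubricCriterion],
                       llm: Callable[[str], str]) -> str:
    """Delegate scoring to any LLM client that maps a prompt string to a response string."""
    return llm(build_grading_prompt(question, answer, rubric))


if __name__ == "__main__":
    # Made-up rubric for demonstration only.
    rubric = [
        RubricCriterion("Defines overfitting as fitting noise in the training data", 2.0),
        RubricCriterion("Notes poor generalization to unseen data", 1.0),
    ]
    # Stand-in for a real model client (any chat-completion API could be plugged in here).
    echo_llm = lambda prompt: f"[LLM response would appear here]\n---\n{prompt}"
    print(grade_short_answer("What is overfitting?",
                             "The model memorizes the training set.",
                             rubric, echo_llm))
```

Because the rubric, not the model, defines what earns marks, the same grading loop can be reused across domains by swapping in a different set of criteria.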