Scale (ratio)
Computer science
Artificial intelligence
Machine learning
Data science
Psychology
Geography
Cartography
Authors
Chong Ho Yu, Zizhong Xiao, Janet Hanson
Identifier
DOI:10.1007/978-981-99-9379-6_13
Abstract
Educational researchers have been utilizing large-scale assessment data from cross-cultural studies. However, the complexity of these big data poses several challenges to traditional statistical models, despite the benefits they provide to decision support and pedagogical practice. First, the sample size and the statistical power of these archival data are so large that any trivial effect might be misidentified as significant. Consequently, the resulting overfit model has low generalizability, contributing to the replication crisis. Second, in addition to academic performance test scores, these large-scale assessments also collect data about students' family backgrounds, living environment, educational institution characteristics, and other aspects associated with learning, such as motivation, engagement, and well-being. This curse of high dimensionality often results in multicollinearity, thus yielding an unstable model. To rectify the situation, machine learning methods, such as neural networks, bagging, gradient boosting, and XGBoost, can localize errors by partitioning the data into subsets and generating numerous submodels. The final model is a synthesis of repeated analyses, and therefore big data can be processed effectively. Further, through regularization, machine learning penalizes complexity, preventing irrelevant predictors from entering the model. An analysis of the relationship between well-being and academic performance based on 2018 PISA data is presented in this chapter as an example of how machine learning can be applied to educational research.
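The idea of building many submodels on subsets of the data and synthesizing them into a final model can be illustrated with a minimal bagging sketch. This is not the chapter's analysis: the data are synthetic rather than PISA, the submodel is a plain least-squares line, the subsets are bootstrap resamples (drawn with replacement) rather than disjoint partitions, and the names `fit_line`, `bagged_fit`, and `bagged_predict` are hypothetical helpers introduced only for illustration.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def bagged_fit(xs, ys, n_models=200, seed=0):
    """Fit one submodel per bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Draw n row indices with replacement (a bootstrap resample).
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagged_predict(models, x):
    """Synthesize the submodels by averaging their predictions."""
    return statistics.fmean(a + b * x for a, b in models)


# Synthetic data (an assumption for illustration): y = 2 + 0.5*x + noise.
data_rng = random.Random(42)
xs = [i / 5 for i in range(50)]
ys = [2 + 0.5 * x + data_rng.gauss(0, 1) for x in xs]

models = bagged_fit(xs, ys)
prediction = bagged_predict(models, 5.0)
print(f"bagged prediction at x=5: {prediction:.2f}")
```

Averaging over resamples reduces the variance of the final prediction relative to any single submodel, which is the sense in which the final model is a synthesis of repeated analyses. Regularization, mentioned separately in the abstract, is not shown here.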