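Overfitting
Generalization
Computer science
Artificial neural network
Artificial intelligence
Monotonic function
Variable (mathematics)
Process (computing)
Motion (physics)
Machine learning
Mathematics
Mathematical analysis
Operating system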
Authors
Tomohisa Okazaki,Nobuyuki Morikawa,Hiroyuki Fujiwara,Naonori Ueda
Abstract
Data-driven machine-learning approaches are being increasingly applied to construct empirical ground-motion models (GMMs). It is standard practice to divide observational records into learning and test datasets to correctly evaluate the predictive performance of a developed model. However, in this study, we show that division based on records or earthquakes is inappropriate for evaluating the generalization performance on recorded sites when GMMs include site-condition proxies as input variables. Complex models exhibit small residuals at sites used in the training process, but exhibit large residuals at new sites owing to overfitting to the trained sites. As a simple solution, we propose a neural network model that has monotonic dependence on some of the input variables. The model successfully obtains the generalization performance on recorded sites, although it lacks the ability to represent oversaturation with input variables suggested in extreme ground-motion ranges. Therefore, alternative methods should be investigated to develop robust data-driven models under general conditions. Dividing the sites into learning and test data would play a fundamental role in developing such robust models.
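The two ideas in the abstract, dividing data by site and constraining a network to depend monotonically on some inputs, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The record field `site_id`, the function names, the one-hidden-layer architecture, and the use of softplus to keep selected weights positive are all assumptions made for illustration; the abstract does not specify how monotonicity is enforced or which inputs are constrained.

```python
import math
import random


def split_by_site(records, test_fraction=0.2, seed=0):
    """Split records so that no site appears in both subsets.

    `records` is a list of dicts with a hypothetical "site_id" key.
    """
    sites = sorted({r["site_id"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(sites)
    n_test = max(1, round(len(sites) * test_fraction))
    test_sites = set(sites[:n_test])
    train = [r for r in records if r["site_id"] not in test_sites]
    test = [r for r in records if r["site_id"] in test_sites]
    return train, test


def softplus(x):
    """Numerically stable map from any real number to a positive number."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)


def monotonic_forward(x_mono, x_free, params):
    """One-hidden-layer network, non-decreasing in every entry of x_mono.

    Weights acting on x_mono and all output weights pass through softplus,
    so they are positive; tanh is increasing, so the output cannot decrease
    when any entry of x_mono increases. Weights on x_free are unconstrained.
    """
    hidden = []
    for w_mono, w_free, b in zip(params["w_mono"], params["w_free"], params["b"]):
        z = b
        z += sum(softplus(w) * x for w, x in zip(w_mono, x_mono))
        z += sum(w * x for w, x in zip(w_free, x_free))
        hidden.append(math.tanh(z))
    return params["b_out"] + sum(
        softplus(v) * h for v, h in zip(params["v"], hidden)
    )
```

With a site-wise split, residuals on the test subset measure performance at sites the model has never seen, which a record-wise or earthquake-wise split does not. For an input on which the prediction should be non-increasing, one option under the same assumptions is to negate that input before passing it in as part of `x_mono`.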