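Computer science
Machine learning
Sensor fusion
Artificial intelligence
Process (computing)
Data mining
Data set
Data modeling
Set (abstract data type)
Domain knowledge
Database
Programming language
Operating system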
Authors
Amal Saadallah,Felix Finkeldey,Jens Buß,Katharina Morik,Petra Wiederkehr,W. Rhode
Identifier
DOI:10.1016/j.aei.2022.101600
Abstract
The performance of machine learning algorithms depends to a large extent on the amount and quality of the data available for training. Simulations are most often used as test beds for assessing the performance of trained models in a simulated environment before deployment in the real world. They can also be used for data annotation, i.e., assigning labels to observed data, thus providing background knowledge for domain experts. We want to integrate this knowledge into the machine learning process and, at the same time, use the simulation as an additional data source. Therefore, we present a framework that allows real-world observations and simulation data to be combined at one of two levels: the data level or the model level. At the data level, observations and simulation data are integrated to form an enriched data set for learning. At the model level, models learned separately from observed and simulated data are combined using an ensemble technique. An automatic selection of the appropriate fusion level is proposed, based on the trade-off between model bias and variance. Our framework is validated using two case studies of very different types. The first is an Industry 4.0 use case that consists of monitoring a milling process in real time. The second is an application in astroparticle physics for background suppression.
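The two fusion levels described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the one-dimensional least-squares model, the equal-weight ensemble, the toy data, and every function name below (`fit_line`, `data_level`, `model_level`, `select_fusion`) are assumptions made for illustration. In particular, the abstract does not specify how the bias-variance criterion is computed, so the sketch swaps in a simpler stand-in: it picks the level with the lower mean squared error on held-out real observations.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def predict(model, xs):
    a, b = model
    return [a * x + b for x in xs]


def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


def data_level(obs, sim):
    """Data-level fusion: fit one model on observed plus simulated samples."""
    model = fit_line(obs[0] + sim[0], obs[1] + sim[1])
    return lambda queries: predict(model, queries)


def model_level(obs, sim, w_obs=0.5):
    """Model-level fusion: fit one model per source, then average predictions.

    The fixed weight w_obs is an assumption; the paper's ensemble may differ.
    """
    m_obs = fit_line(*obs)
    m_sim = fit_line(*sim)
    return lambda queries: [
        w_obs * p + (1.0 - w_obs) * s
        for p, s in zip(predict(m_obs, queries), predict(m_sim, queries))
    ]


def select_fusion(obs_train, sim, obs_val):
    """Stand-in selection rule: lower validation MSE on real observations."""
    candidates = {
        "data": data_level(obs_train, sim),
        "model": model_level(obs_train, sim),
    }
    scores = {name: mse(f(obs_val[0]), obs_val[1]) for name, f in candidates.items()}
    return min(scores, key=scores.get), scores


def make_samples(rng, n, slope, intercept, noise):
    """Hypothetical toy generator: noisy points on a line."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [slope * x + intercept + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


rng = random.Random(0)
obs_train = make_samples(rng, 15, 2.0, 1.0, 1.0)   # few, noisy real observations
obs_val = make_samples(rng, 15, 2.0, 1.0, 1.0)     # held-out real observations
sim = make_samples(rng, 200, 2.0, 1.5, 0.1)        # many, low-noise, slightly biased

best, scores = select_fusion(obs_train, sim, obs_val)
print("selected fusion level:", best)
print("validation MSE per level:", scores)
```

In this toy setup the simulation is plentiful and low-noise but carries a systematic offset, while the observations are unbiased but scarce, which is the kind of situation where the choice between pooling the data and combining separate models matters.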