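Discriminative
Computer science
Probabilistic logic
Machine learning
Sensor fusion
Reliability (semiconductor)
Artificial intelligence
Generative model
Optics (focusing)
Domain (mathematical analysis)
Fusion
Data mining
Representation (politics)
Statistical model
Generative grammar
Mathematics
Optics
Physics
Politics
Mathematical analysis
Philosophy
Quantum mechanics
Linguistics
Power (physics)
Law
Political science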
Authors
Manas Joglekar, Θεόδωρος Ρεκατσίνας, Héctor García-Molina, Aditya Parameswaran, Christopher Ré
Source
Journal: Cornell University - arXiv
Date: 2015-12-21
Citations: 1
Abstract
We focus on data fusion, i.e., the problem of unifying conflicting data from data sources into a single representation by estimating the source accuracies. We propose SLiMFast, a framework that expresses data fusion as a statistical learning problem over discriminative probabilistic models, which in many cases correspond to logistic regression. In contrast to previous approaches that use complex generative models, discriminative models make fewer distributional assumptions over data sources and allow us to obtain rigorous theoretical guarantees. Furthermore, we show how SLiMFast enables incorporating domain knowledge into data fusion, yielding accuracy improvements of up to 50% over state-of-the-art baselines. Building upon our theoretical results, we design an optimizer that obviates the need for users to manually select an algorithm for learning SLiMFast's parameters. We validate our optimizer on multiple real-world datasets and show that it can accurately predict the learning algorithm that yields the best data fusion results.
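To make the idea of data fusion as logistic regression concrete, here is a minimal sketch. It is not SLiMFast itself and is not taken from the paper. It assumes binary object values, sources that err independently, a small set of objects with known ground truth, and one learned weight per source with no intercept. Under those assumptions the weight of a source is the log-odds of its accuracy. All names (`toy_dataset`, `fit_source_weights`, `posterior`) and the toy data are hypothetical.

```python
import itertools
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def toy_dataset():
    """Hypothetical data: 24 labelled objects, 3 sources.

    Sources 0, 1, 2 agree with the truth on 3/4, 2/3, and 1/2 of the
    objects, with errors spread independently across sources.
    A claim is +1 (source says value 1) or -1 (source says value 0).
    """
    claims, labels = [], []
    patterns = itertools.product([1, 1, 1, -1], [1, 1, -1], [1, -1])
    for i, agree in enumerate(patterns):
        y = i % 2                      # alternate the true value
        sign = 1 if y == 1 else -1
        claims.append([sign * a for a in agree])
        labels.append(y)
    return claims, labels


def fit_source_weights(claims, labels, lr=0.5, epochs=2000):
    """Logistic regression without intercept, fit by gradient ascent."""
    n_sources = len(claims[0])
    w = [0.0] * n_sources
    for _ in range(epochs):
        grad = [0.0] * n_sources
        for x, y in zip(claims, labels):
            p = sigmoid(sum(ws * xs for ws, xs in zip(w, x)))
            for s in range(n_sources):
                grad[s] += (y - p) * x[s]
        for s in range(n_sources):
            w[s] += lr * grad[s] / len(labels)
    return w


def posterior(x, w):
    """Estimated probability that the true value is 1 given claims x."""
    return sigmoid(sum(ws * xs for ws, xs in zip(w, x)))


claims, labels = toy_dataset()
weights = fit_source_weights(claims, labels)
accuracies = [sigmoid(v) for v in weights]   # weight = log-odds of accuracy
fused = posterior([1, -1, -1], weights)      # source 0 vs. sources 1 and 2
print([round(a, 3) for a in accuracies], round(fused, 3))
```

On this toy data the estimated accuracies should come out close to 0.75, 0.667, and 0.5, and the fused probability for the conflicting claim close to 0.6: the most accurate source outweighs a two-source majority, which a plain majority vote would not capture. A claim value of 0 can stand for "source made no claim", since it contributes nothing to the weighted sum.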