Topics
Heteroscedasticity
Hyperparameter
Computer science
Artificial neural network
Logarithm
Probabilistic logic
Artificial intelligence
Variance (accounting)
Machine learning
Range (aeronautics)
Gaussian distribution
Likelihood function
Interpretability
Point estimation
Function (biology)
Econometrics
Statistics
Mathematics
Estimation theory
Algorithm
Mathematical analysis
Materials science
Physics
Accounting
Quantum mechanics
Evolutionary biology
Business
Composite material
Biology
Authors
Maximilian Seitzer, Arash Tavakoli, Dimitrije Antic, Georg Martius
Source
Journal: Cornell University - arXiv
Date: 2022-01-01
Citations: 26
Identifier
DOI: 10.48550/arxiv.2203.09168
Abstract
Capturing aleatoric uncertainty is a critical part of many machine learning systems. In deep learning, a common approach to this end is to train a neural network to estimate the parameters of a heteroscedastic Gaussian distribution by maximizing the logarithm of the likelihood function under the observed data. In this work, we examine this approach and identify potential hazards associated with the use of log-likelihood in conjunction with gradient-based optimizers. First, we present a synthetic example illustrating how this approach can lead to very poor but stable parameter estimates. Second, we identify the culprit to be the log-likelihood loss, along with certain conditions that exacerbate the issue. Third, we present an alternative formulation, termed $\beta$-NLL, in which each data point's contribution to the loss is weighted by the $\beta$-exponentiated variance estimate. We show that using an appropriate $\beta$ largely mitigates the issue in our illustrative example. Fourth, we evaluate this approach on a range of domains and tasks and show that it achieves considerable improvements and performs more robustly with respect to hyperparameters, in terms of both predictive RMSE and log-likelihood criteria.
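Since the abstract only sketches the $\beta$-NLL formulation, the following is a minimal PyTorch sketch of one plausible reading: each data point's Gaussian negative log-likelihood is multiplied by that point's variance estimate raised to the power $\beta$, with the weighting factor excluded from the gradient so that it acts as a per-sample rescaling rather than a new objective term. The function name `beta_nll_loss`, the dropped constants, and the stop-gradient on the weight are illustrative assumptions, not code from the paper.

```python
import torch

def beta_nll_loss(mean: torch.Tensor, variance: torch.Tensor,
                  target: torch.Tensor, beta: float = 0.5) -> torch.Tensor:
    """Gaussian NLL with each sample's loss weighted by variance**beta.

    beta = 0 recovers the standard heteroscedastic Gaussian NLL;
    beta = 1 rescales the mean's gradient to match a plain MSE loss.
    """
    # Per-sample negative log-likelihood of a Gaussian (constants dropped).
    nll = 0.5 * (torch.log(variance) + (target - mean) ** 2 / variance)
    # beta-exponentiated variance weight; detach() stops gradients from
    # flowing through the weight itself (an assumption of this sketch).
    weight = variance.detach() ** beta
    return (weight * nll).mean()
```

With `mean` and `variance` produced by a two-headed network (the variance passed through, e.g., a softplus plus a small epsilon to stay positive), this would be a drop-in replacement for the usual NLL loss; per the abstract, an appropriate $\beta$ makes training considerably more robust to hyperparameter choices.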