Keywords
Computer science
Uncertainty quantification
Artificial intelligence
Machine learning
Scalability
Robustness
Calibration
Context
Ensemble learning
Deep learning
Mathematics
Statistics
Database
Authors
Dell Zhang, Murat Sensoy, Masoud Makrehchi, Bilyana Taneva-Popova, Lin Gui, Yulan He
Identifier
DOI: 10.1145/3539618.3594243
Abstract
This full-day tutorial introduces modern techniques for practical uncertainty quantification, specifically in the context of multi-class and multi-label text classification. First, we explain the usefulness of estimating aleatoric uncertainty and epistemic uncertainty for text classification models. Then, we describe several state-of-the-art approaches to uncertainty quantification and analyze their scalability to big text data: Virtual Ensemble in GBDT, Bayesian Deep Learning (including Deep Ensemble, Monte-Carlo Dropout, Bayes by Backprop, and their generalization Epistemic Neural Networks), Evidential Deep Learning (including Prior Networks and Posterior Networks), as well as Distance Awareness (including Spectral-normalized Neural Gaussian Process and Deep Deterministic Uncertainty). Next, we cover the latest advances in uncertainty quantification for pre-trained language models (including asking language models to express their uncertainty, interpreting uncertainties of text classifiers built on large-scale language models, uncertainty estimation in text generation, calibration of language models, and calibration for in-context learning). After that, we discuss typical application scenarios of uncertainty quantification in text classification (including in-domain calibration, cross-domain robustness, and novel class detection). Finally, we list popular performance metrics for evaluating the effectiveness of uncertainty quantification in text classification. Practical hands-on examples and exercises are provided so that attendees can experiment with different uncertainty quantification methods on real-world text classification datasets such as CLINC150.
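To make the hands-on part of the abstract concrete, the sketch below illustrates two of the techniques named above: Monte-Carlo Dropout for estimating predictive and epistemic uncertainty of a multi-class text classifier, and expected calibration error (ECE) as a typical evaluation metric for in-domain calibration. This is a minimal illustration, not the tutorial's reference code; the model architecture, hyper-parameters, and helper names are assumptions made here for the example.

```python
# Minimal sketch (illustrative assumptions, not the tutorial's reference code):
# MC Dropout uncertainty estimation for a multi-class text classifier, plus ECE.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TextClassifier(nn.Module):
    """Bag-of-embeddings classifier with dropout, so MC Dropout applies."""

    def __init__(self, vocab_size=10000, embed_dim=128, num_classes=150, p_drop=0.3):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, embed_dim)
        self.dropout = nn.Dropout(p_drop)
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, token_ids, offsets):
        h = self.dropout(self.embed(token_ids, offsets))
        return self.fc(h)  # unnormalised class logits


@torch.no_grad()
def mc_dropout_predict(model, token_ids, offsets, n_samples=20):
    """Keep dropout active at test time and average the softmax outputs.

    Returns the mean predictive distribution plus two common uncertainty
    scores: total uncertainty (entropy of the mean prediction) and
    epistemic uncertainty (mutual information between prediction and
    model parameters, the BALD score).
    """
    model.train()  # enables dropout; freeze batch-norm layers here if the model has any
    probs = torch.stack(
        [F.softmax(model(token_ids, offsets), dim=-1) for _ in range(n_samples)]
    )                                # (n_samples, batch, num_classes)
    mean_probs = probs.mean(dim=0)   # predictive distribution per example

    total_unc = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(-1)
    expected_unc = -(probs * probs.clamp_min(1e-12).log()).sum(-1).mean(0)
    epistemic_unc = total_unc - expected_unc
    return mean_probs, total_unc, epistemic_unc


def expected_calibration_error(confidences, predictions, labels, n_bins=15):
    """Standard ECE: bin by confidence, compare per-bin accuracy to mean confidence."""
    ece = torch.zeros(1)
    bin_edges = torch.linspace(0, 1, n_bins + 1)
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            acc = (predictions[in_bin] == labels[in_bin]).float().mean()
            conf = confidences[in_bin].mean()
            ece += in_bin.float().mean() * (acc - conf).abs()
    return ece.item()
```

Here `mean_probs.max(dim=-1)` would supply the confidences and predicted labels consumed by `expected_calibration_error`; a Deep Ensemble follows the same averaging recipe, with the sampling loop replaced by independently trained models.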