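Interferon
Hierarchical Dirichlet process
Generalized Dirichlet distribution
Latent Dirichlet allocation
Mixture model
Mathematics
Dirichlet distribution
Inference
Hierarchical clustering
Cluster analysis
Computer science
Topic model
Dirichlet energy
Statistics
Artificial intelligence
Boundary value problem
Mathematical analysis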
Authors
Yee Whye Teh,Michael I. Jordan,Matthew J. Beal,David M. Blei
Identifier
DOI:10.1198/016214506000000302
Abstract
We consider problems involving groups of data where each observation within a group is a draw from a mixture model and where it is desirable to share mixture components between groups. We assume that the number of mixture components is unknown a priori and is to be inferred from the data. In this setting it is natural to consider sets of Dirichlet processes, one for each group, where the well-known clustering property of the Dirichlet process provides a nonparametric prior for the number of mixture components within each group. Given our desire to tie the mixture models in the various groups, we consider a hierarchical model, specifically one in which the base measure for the child Dirichlet processes is itself distributed according to a Dirichlet process. Such a base measure being discrete, the child Dirichlet processes necessarily share atoms. Thus, as desired, the mixture models in the different groups necessarily share mixture components. We discuss representations of hierarchical Dirichlet processes in terms of a stick-breaking process, and a generalization of the Chinese restaurant process that we refer to as the “Chinese restaurant franchise.” We present Markov chain Monte Carlo algorithms for posterior inference in hierarchical Dirichlet process mixtures and describe applications to problems in information retrieval and text modeling.
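The "Chinese restaurant franchise" mentioned in the abstract can be illustrated with a small prior simulation. The sketch below is not the authors' code; it assumes the standard formulation in which each group is a restaurant, a customer joins an existing table with probability proportional to its occupancy or opens a new table with weight alpha0, and each new table picks a dish (mixture component) with probability proportional to the number of tables already serving it across all restaurants, or a new dish with weight gamma. The function name and the parameter names `alpha0`, `gamma`, and `seed` are hypothetical choices made for this illustration.

```python
import random


def chinese_restaurant_franchise(group_sizes, alpha0, gamma, seed=0):
    """Draw component assignments from a Chinese restaurant franchise prior.

    group_sizes: number of observations in each group (restaurant).
    alpha0: concentration for opening a new table within a group (assumed name).
    gamma: concentration for introducing a new dish franchise-wide (assumed name).
    Returns (assignments, dish_table_counts), where assignments[j][i] is the
    dish index of observation i in group j.
    """
    rng = random.Random(seed)
    dish_table_counts = []  # number of tables serving each dish, over all groups
    assignments = []
    for n_j in group_sizes:
        table_counts = []  # customers seated at each table in this group
        table_dish = []    # dish served at each table in this group
        dishes = []
        for _ in range(n_j):
            # Existing table t with weight n_jt, or a new table with weight alpha0.
            weights = table_counts + [alpha0]
            t = rng.choices(range(len(weights)), weights=weights)[0]
            if t == len(table_counts):
                # New table: existing dish k with weight m_k, or a new dish
                # with weight gamma. Dishes are shared across all groups.
                dish_weights = dish_table_counts + [gamma]
                k = rng.choices(range(len(dish_weights)), weights=dish_weights)[0]
                if k == len(dish_table_counts):
                    dish_table_counts.append(0)
                dish_table_counts[k] += 1
                table_counts.append(0)
                table_dish.append(k)
            table_counts[t] += 1
            dishes.append(table_dish[t])
        assignments.append(dishes)
    return assignments, dish_table_counts


if __name__ == "__main__":
    groups, tables_per_dish = chinese_restaurant_franchise(
        [50, 30, 20], alpha0=1.0, gamma=1.0, seed=1
    )
    for j, dishes in enumerate(groups):
        print(f"group {j}: components used = {sorted(set(dishes))}")
    print("tables per component:", tables_per_dish)
```

Because the dish list is global, a component introduced in one group can be reused by later groups, which is the sharing of mixture components that the abstract describes. This simulates the prior only; posterior inference would additionally need a likelihood and a sampler such as the Markov chain Monte Carlo schemes the abstract refers to.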