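Latent Dirichlet allocation
Pooling
Topic model
Computer science
Social media
Similarity (geometry)
Scheme (mathematics)
Information retrieval
Quality (philosophy)
World Wide Web
Data science
Artificial intelligence
Mathematical analysis
Philosophy
Image (mathematics)
Epistemology
Mathematics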
Authors
Prateek Mehta,Vasudeva Varma
Identifier
DOI:10.1109/fruct.2016.7584770
Abstract
Topic models such as Latent Dirichlet Allocation (LDA) have historically served as a successful tool for various data mining applications on conventional documents such as news articles or academic abstracts. However, the standard use of topic models on social media posts poses several problems, because such posts are short, messy, and generated non-uniformly by the users of social media platforms. In this paper, we propose a new approach, community-based document pooling, to train better topic models over social media posts and to address these problems without modifying the basic machinery of LDA. We compare our approach to the popular user-based pooling scheme and show a significant improvement in the quality of the topic models.
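The pooling idea in the abstract can be illustrated with a minimal sketch: short posts that share a pooling key are concatenated into one longer pseudo-document, which could then be passed to an unmodified LDA implementation. Everything named here is an assumption made for illustration, including the `pool_posts` helper, the toy posts, and the `community_of` mapping; the abstract does not say how communities are identified, so the mapping is simply taken as given.

```python
from collections import defaultdict


def pool_posts(posts, key_of):
    """Concatenate posts that share a pooling key into one pseudo-document.

    posts  -- iterable of (author, text) pairs
    key_of -- function mapping an author to a pooling key
    """
    pools = defaultdict(list)
    for author, text in posts:
        pools[key_of(author)].append(text)
    # Join each pool's posts, in input order, into a single string.
    return {key: " ".join(texts) for key, texts in pools.items()}


# Hypothetical toy data: (author, post text) pairs.
posts = [
    ("alice", "new lda paper out"),
    ("bob", "topic models on tweets"),
    ("alice", "pooling short posts helps"),
    ("carol", "match tonight"),
]

# Hypothetical author-to-community map (assumed to be given; the abstract
# does not describe how communities are detected).
community_of = {"alice": "c1", "bob": "c1", "carol": "c2"}

# User-based pooling: one pseudo-document per author.
user_pools = pool_posts(posts, lambda author: author)

# Community-based pooling: one pseudo-document per community.
community_pools = pool_posts(posts, lambda author: community_of[author])
```

In this sketch the two schemes differ only in the pooling key, which matches the abstract's point that the LDA machinery itself is left unchanged; only the documents it is trained on are different.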