Extracting Actionable Insights from Text Data: A Stable Topic Model Approach

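Keywords: Computer science; Topic model; Inference; Theory (learning stability); Data science; Information retrieval; Artificial intelligence; Machine learning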
Authors
Yi Yang, Ramanath Subramanyam
Source
Journal: Management Information Systems Quarterly [MIS Quarterly]
Volume (issue): 47 (3): 923-954 Citations: 7
Identifier
DOI:10.25300/misq/2022/16957
Abstract

Topic models are becoming a frequently employed tool in the empirical methods repertoire of information systems and management scholars. Given textual corpora, such as consumer reviews and online discussion forums, researchers and business practitioners often use topic modeling to either explore data in an unsupervised fashion or generate variables of interest for subsequent econometric analysis. However, one important concern stems from the fact that topic models can be notorious for their instability, i.e., the generated results could be inconsistent and irreproducible at different times, even on the same dataset. Therefore, researchers might arrive at potentially unreliable results regarding the theoretical relationships that they are testing or developing. In this paper, we attempt to highlight this problem and suggest a potential approach to addressing it. First, we empirically define and evaluate the stability problem of topic models using four textual datasets. Next, to alleviate the problem and with the goal of extracting actionable insights from textual data, we propose a new method, Stable LDA, which incorporates topical word clusters into the topic model to steer the model inference toward consistent results. We show that the proposed Stable LDA approach can significantly improve model stability while maintaining or even improving the topic model quality. Further, employing two case studies related to an online knowledge community and online consumer reviews, we demonstrate that the variables generated from Stable LDA can lead to more consistent estimations in econometric analyses. We believe that our work can further enhance management scholars’ collective toolkit to analyze ever-growing textual data.
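To make the notion of instability concrete, the sketch below compares the top words of two topic-model runs on the same corpus. It is only an illustration under stated assumptions: the abstract does not give the paper's stability definition, so the Jaccard-overlap score, the function names `jaccard` and `run_stability`, and the toy word lists are all hypothetical and are not the Stable LDA method.

```python
from itertools import permutations


def jaccard(a, b):
    """Jaccard similarity between two collections of words."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def run_stability(run_a, run_b):
    """Mean Jaccard overlap of top words under the best one-to-one topic alignment.

    Each run is a list of topics, and each topic is a list of its top words.
    Topic indices are arbitrary across runs, so topics are aligned first.
    The alignment is brute-forced over permutations, which is only practical
    for a handful of topics (an assumption made to keep the sketch short).
    """
    if len(run_a) != len(run_b):
        raise ValueError("both runs must have the same number of topics")
    if not run_a:
        return 1.0
    best = 0.0
    for perm in permutations(range(len(run_b))):
        score = sum(jaccard(topic, run_b[j]) for topic, j in zip(run_a, perm))
        best = max(best, score)
    return best / len(run_a)


# Hypothetical top words from two runs of a topic model on the same reviews.
run_1 = [["battery", "charge", "life"], ["screen", "display", "bright"]]
run_2 = [["screen", "display", "pixel"], ["battery", "charge", "life"]]

print(run_stability(run_1, run_2))  # 0.75
```

A score of 1.0 would mean the two runs recover identical top-word sets after alignment; lower values indicate that repeated runs disagree, which is the kind of inconsistency the abstract warns can carry over into downstream econometric variables.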