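Computer science
Natural language processing
Artificial intelligence
Hypersphere
Embedding
Knowledge graph
Graph
Word (group theory)
Coherence (philosophical gambling strategy)
Topic model
Information retrieval
Theoretical computer science
Mathematics
Statistics
Geometry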
Authors
Hafsa Ennajari,Nizar Bouguila,Jamal Bentahar
Identifier
DOI:10.1109/tpami.2025.3550032
Abstract
With the prevalence of short texts in various forms such as news headlines, tweets, and reviews, short text analysis has gained significant interest in recent times. However, modeling short texts remains a challenging task because such texts are sparse and noisy. In this paper, we propose a new Spherical Correlated Topic Model (SCTM), which takes into account the correlation between topics. Our model integrates word and knowledge graph embeddings to better capture the semantic relationships among short texts. We adopt the von Mises-Fisher distribution to model the high-dimensional word and entity embeddings on a hypersphere, enabling better preservation of the angular relationships between topic vectors. Moreover, knowledge graph embeddings are incorporated to further enrich the semantic meaning of short texts. Experimental results on several datasets demonstrate that the proposed SCTM outperforms existing models in terms of both topic coherence and document classification. In addition, our model is capable of providing interpretable topics and revealing meaningful correlations among short texts.
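The abstract does not give the model's equations, so the following is only an illustrative sketch of the von Mises-Fisher (vMF) density it refers to, not the authors' SCTM implementation. The vMF density on the unit sphere is f(x; mu, kappa) = C_p(kappa) exp(kappa mu^T x), so the likelihood of an embedding depends only on its angle to the mean direction mu. To stay self-contained, the sketch assumes 3-dimensional embeddings, where the normalizer has the closed form C_3(kappa) = kappa / (4 pi sinh kappa); real word and entity embeddings are high-dimensional and need a Bessel-function normalizer. The function names, the shared concentration kappa, and the uniform prior over topics are all hypothetical choices made for illustration.

```python
import math


def normalize(v):
    """Project a vector onto the unit sphere (L2 normalization)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]


def vmf_log_density_3d(x, mu, kappa):
    """Log-density of a vMF distribution on the unit sphere in R^3.

    Assumption: 3-D only, where C_3(kappa) = kappa / (4 * pi * sinh(kappa)).
    """
    if kappa <= 0.0:
        raise ValueError("kappa must be positive")
    x, mu = normalize(x), normalize(mu)
    cosine = sum(a * b for a, b in zip(x, mu))
    # Stable log(sinh(kappa)) = kappa + log(1 - exp(-2 * kappa)) - log(2)
    log_sinh = kappa + math.log1p(-math.exp(-2.0 * kappa)) - math.log(2.0)
    log_c = math.log(kappa) - math.log(4.0 * math.pi) - log_sinh
    return log_c + kappa * cosine


def topic_responsibilities(x, topic_means, kappa):
    """Posterior over topics for one embedding.

    Assumptions: uniform prior over topics and one kappa shared by all topics.
    """
    logs = [vmf_log_density_3d(x, mu, kappa) for mu in topic_means]
    top = max(logs)  # subtract the max before exponentiating for stability
    weights = [math.exp(value - top) for value in logs]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical topic directions and a hypothetical word embedding.
    topics = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    word = [0.9, 0.1, 0.0]
    print(topic_responsibilities(word, topics, kappa=10.0))
```

Because every vector is normalized first, only direction matters: scaling an embedding leaves its topic responsibilities unchanged, which is the angular behaviour the abstract attributes to modeling on a hypersphere.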