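Computer science
Word (group theory)
Document classification
Natural language processing
Artificial intelligence
Pooling
Vector space
Vector space model
Information retrieval
Space (punctuation)
Mathematics
Geometry
Operating system
Authors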
Xing Chao,Dong Wang,Xuewei Zhang,Chao Liu
Identifier
DOI:10.1109/apsipa.2014.7041633
Abstract
The word-to-vector (W2V) technique represents words as low-dimensional continuous vectors in such a way that semantically related words are close to each other. This produces a semantic space in which a word or a word collection (e.g., a document) can be well represented, and thus lends itself to a multitude of applications, including document classification. Our previous study demonstrated that representations derived from word vectors are highly promising in document classification and can deliver better performance than the conventional LDA model. This paper extends the previous research and proposes to model the distributions of word vectors in documents or document classes. This goes beyond the naive approach of deriving document representations by average pooling and explores the possibility of modeling documents in the semantic space. Experiments on the Sohu text database confirmed that the new approach can produce better document classification performance.
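The contrast the abstract draws between average pooling and modeling the distribution of word vectors can be sketched in a few lines. The following is only an illustrative sketch, not the authors' implementation: the abstract does not say which distribution model the paper uses, so a single diagonal Gaussian per document class is assumed here, and the word vectors, class labels, and function names are all hypothetical.

```python
import math

# Toy 2-dimensional word vectors (hypothetical values, for illustration only).
WORD_VECTORS = {
    "goal": [0.9, 0.1],
    "match": [0.8, 0.2],
    "stock": [0.1, 0.9],
    "market": [0.2, 0.8],
}


def lookup(tokens, vectors):
    """Collect the vectors of all in-vocabulary tokens."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("no in-vocabulary tokens")
    return known


def average_pool(tokens, vectors):
    """Naive document representation: the mean of the word vectors."""
    known = lookup(tokens, vectors)
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def fit_diag_gaussian(points, var_floor=1e-3):
    """Fit a per-dimension mean and (floored) variance to a set of vectors."""
    n, dim = len(points), len(points[0])
    mean = [sum(p[i] for p in points) / n for i in range(dim)]
    var = [
        max(sum((p[i] - mean[i]) ** 2 for p in points) / n, var_floor)
        for i in range(dim)
    ]
    return mean, var


def log_likelihood(point, model):
    """Log density of one vector under a diagonal Gaussian."""
    mean, var = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(point, mean, var)
    )


def classify(tokens, vectors, class_models):
    """Pick the class whose model best explains the document's word vectors."""
    known = lookup(tokens, vectors)
    scores = {
        label: sum(log_likelihood(v, model) for v in known)
        for label, model in class_models.items()
    }
    return max(scores, key=scores.get)


# Assumed setup: one Gaussian per class, fitted on that class's word vectors.
CLASS_MODELS = {
    "sports": fit_diag_gaussian([WORD_VECTORS["goal"], WORD_VECTORS["match"]]),
    "finance": fit_diag_gaussian([WORD_VECTORS["stock"], WORD_VECTORS["market"]]),
}

doc_vector = average_pool(["goal", "match", "unknown"], WORD_VECTORS)
predicted = classify(["stock", "market"], WORD_VECTORS, CLASS_MODELS)
```

Average pooling reduces a document to a single point, whereas the class-level model scores every word vector in the document individually, so it retains information about how the vectors are spread in the semantic space.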