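Computer science
Word (group theory)
Word embedding
Similarity (geometry)
Data mining
Futures studies
Artificial intelligence
Embedding
Key (lock)
Extrapolation
Machine learning
Data science
Natural language processing
Mathematics
Statistics
Image (mathematics)
Computer security
Geometry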
Authors
Necip Gözüaçık,C. Okan Sakar,Sercan Ozcan
Identifier
DOI:10.1016/j.techfore.2023.122520
Abstract
There are a vast number of quantitative and qualitative technological forecasting methods. In the last decade, advanced quantitative technological forecasting methods based on various applications of data science approaches have been proposed. Text mining is one of the key approaches used to examine large datasets consisting of scientific publications and patent documents with the aim of offering foresight for a selected area. However, existing related studies either take a qualitative approach, analysing recent data to identify emerging topics, or use extrapolation techniques to predict the future values of some statistical terms or the future frequency of some important keywords. In this study, in contrast to such related studies, we propose a deep learning-based framework to predict the future co-similarity matrix, which represents the possible new and disappearing interactions between words in the future. For this purpose, word vectors are generated using a word embedding technique, and the temporal changes in the associations between words are modelled using Long Short-Term Memory networks to estimate the future word embedding matrix. The text mining area is chosen as a case study. The clusters of terms extracted from the predicted word embedding matrices were analysed, and potentially emerging areas were identified for different prediction horizon lengths. The accuracy of the proposed model was analysed based on a set of evaluation metrics that measure the amount of overlap between the actual and predicted word maps. The quantitative analysis showed that the proposed system can successfully identify emerging and disappearing areas and can be used as a decision-making tool for the future projection of other areas.
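The abstract does not say how the co-similarity matrix or the overlap metrics are computed, so the following is only a minimal sketch under stated assumptions: cosine similarity is assumed as the similarity measure, and a Jaccard index over the top-k most similar word pairs is assumed as one possible overlap measure. The function names and the toy two-dimensional vectors are hypothetical; in particular, the `predicted` vectors are hand-written stand-ins for what an LSTM forecast might produce, not model output.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def co_similarity(embeddings):
    """Map each unordered word pair (a, b), a < b, to its cosine similarity."""
    words = sorted(embeddings)
    return {
        (a, b): cosine(embeddings[a], embeddings[b])
        for a in words
        for b in words
        if a < b
    }


def top_pairs(sim, k):
    """Return the k word pairs with the highest similarity."""
    return set(sorted(sim, key=sim.get, reverse=True)[:k])


def jaccard_overlap(actual_sim, predicted_sim, k):
    """Jaccard index between the top-k pairs of two co-similarity maps (assumed metric)."""
    a, p = top_pairs(actual_sim, k), top_pairs(predicted_sim, k)
    union = a | p
    return len(a & p) / len(union) if union else 1.0


# Hypothetical toy embeddings for one future time step.
actual = {
    "text": [1.0, 0.0],
    "mining": [0.9, 0.1],
    "patent": [0.0, 1.0],
    "forecast": [0.1, 0.9],
}
# Hand-written stand-in for a forecast of the same embedding matrix.
predicted = {
    "text": [1.0, 0.1],
    "mining": [0.8, 0.2],
    "patent": [0.1, 1.0],
    "forecast": [0.0, 1.0],
}

if __name__ == "__main__":
    actual_sim = co_similarity(actual)
    predicted_sim = co_similarity(predicted)
    for k in (2, 3):
        print(k, jaccard_overlap(actual_sim, predicted_sim, k))
```

A rank-based overlap such as this one compares which word associations are strongest rather than the raw similarity values, which is one plausible way to check whether predicted and actual word maps agree on emerging or disappearing links.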