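Computer science
Named-entity recognition
Word embedding
Artificial neural network
Feature (linguistics)
Artificial intelligence
Graph
Entity linking
Encoder
Embedding
Task (project management)
Machine learning
Knowledge base
Theoretical computer science
Philosophy
Economics
Management
Operating system
Linguistics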
Authors
Xiaodi Wang,Jiayong Liu
Identifier
DOI:10.1016/j.knosys.2022.110114
Abstract
Continuous cyberattacks generate a large amount of threat intelligence online every day. However, most threat intelligence is unstructured, multisource, heterogeneous text, which makes it difficult for security analysts to understand implicit threats in time. Knowledge graphs (KGs), an important research topic in recent years, enable automated, real-time analysis of threat intelligence in cybersecurity. As one of the critical technologies behind KGs, named entity recognition (NER) identifies cyberattack-related entities. Long-distance structured information captured by dependency trees has been shown to provide rich semantic representations for neural networks. However, existing work mostly relies on a simple linear stack of neural networks when utilizing structured features, which leaves the interaction between different types of neural networks vague. In addition, existing models are insensitive to the boundaries of complex entity terms in cybersecurity. In this study, we propose a new feature integration and entity boundary detection (FIEBD) model. In our model, a new pretrained language model, PERT, is applied to obtain word embeddings of cyber texts. Moreover, a novel neural network cell, GARU, is developed to incorporate the different types of features extracted from graph neural networks and recurrent neural networks. It combines a graph encoder with a gate mechanism, aiming to obtain a better hidden representation through explicit interaction. Furthermore, considering the large number of complex entities in cybersecurity, we contribute an entity boundary detection module that performs entity head and tail prediction as an augmentation task. We conduct extensive experiments on cybersecurity datasets. The results demonstrate that the proposed model achieves better performance than existing baseline methods.
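The abstract does not give the equations of the GARU cell or of the boundary detection module, so the following is only a minimal sketch of the two general ideas it names: blending graph-derived and recurrent features through a gate, and deriving entity head and tail targets for a boundary task. The function names `gated_fusion` and `boundary_labels`, the single-gate formulation, the BIO tagging scheme, and the entity types `MAL` and `TOOL` are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def gated_fusion(h_seq, h_graph, W, b):
    """Blend per-token recurrent and graph features with one gate.

    Hypothetical formulation (not the paper's GARU cell):
        gate  = sigmoid([h_seq ; h_graph] W + b)
        fused = gate * h_seq + (1 - gate) * h_graph
    Shapes: h_seq and h_graph are (n_tokens, d), W is (2d, d), b is (d,).
    """
    gate = sigmoid(np.concatenate([h_seq, h_graph], axis=-1) @ W + b)
    return gate * h_seq + (1.0 - gate) * h_graph


def boundary_labels(bio_tags):
    """Derive entity head/tail indicators from BIO tags (assumed scheme).

    A head is any token tagged B-*; a tail is any entity token that is
    not followed by an I-* token.
    """
    heads = [0] * len(bio_tags)
    tails = [0] * len(bio_tags)
    for i, tag in enumerate(bio_tags):
        if tag.startswith("B-"):
            heads[i] = 1
        if tag != "O":
            nxt = bio_tags[i + 1] if i + 1 < len(bio_tags) else "O"
            if not nxt.startswith("I-"):
                tails[i] = 1
    return heads, tails


# Toy usage with random features standing in for encoder outputs.
rng = np.random.default_rng(0)
n_tokens, d = 4, 8
h_seq = rng.normal(size=(n_tokens, d))    # stand-in for recurrent features
h_graph = rng.normal(size=(n_tokens, d))  # stand-in for graph features
W = rng.normal(scale=0.1, size=(2 * d, d))
b = np.zeros(d)

fused = gated_fusion(h_seq, h_graph, W, b)
heads, tails = boundary_labels(["B-MAL", "I-MAL", "O", "B-TOOL"])
print(fused.shape, heads, tails)
```

Because the gate lies strictly between 0 and 1, each fused value falls between the corresponding recurrent and graph values, which is one simple way to make the interaction between the two feature types explicit rather than stacking one network on top of the other.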