A novel feature integration and entity boundary detection for named entity recognition in cybersecurity

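Computer science, Named-entity recognition, Word embedding, Artificial neural network, Feature (linguistics), Artificial intelligence, Graph, Entity linking, Encoder, Embedding, Task (project management), Machine learning, Knowledge base, Theoretical computer science, Philosophy, Economics, Management, Operating system, Linguistics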

Authors

Xiaodi Wang, Jiayong Liu

Source

Journal: Knowledge-Based Systems [Elsevier]
Date: 2022-11-16 Volume: 260, Article: 110114 Citations: 38

Identifier

DOI: 10.1016/j.knosys.2022.110114

Abstract

Owing to continuous cyberattacks, a large amount of threat intelligence is generated online every day. However, threat intelligence is mostly unstructured, multisource, heterogeneous text, which makes it difficult for security analysts to understand implicit threats in time. Knowledge graphs (KGs), an important research topic in recent years, can support automated, real-time analysis of threat intelligence in cybersecurity. As one of the critical technologies of KGs, named entity recognition (NER) can identify cyberattack-related entities. It has been shown that long-distance structured information captured by dependency trees provides a rich semantic expression for neural networks. However, existing work focuses mostly on simple linear stacking of neural networks when utilizing structured features, and the interaction between different types of neural networks remains vague. In addition, existing models are insensitive to the boundaries of complex entity terms in cybersecurity. In this study, we propose a new feature integration and entity boundary detection (FIEBD) model. In our model, a new pretrained language model, PERT, is applied to obtain word embeddings of cyber texts. Moreover, a novel neural network cell, GARU, is developed to incorporate different types of features extracted from graph neural networks and recurrent neural networks. It combines the graph encoder with the gate mechanism, aiming to obtain a better hidden representation through explicit interaction. Furthermore, considering the large number of complex entities in cybersecurity, we contribute an entity boundary detection module that performs entity head and tail prediction as an augmentation task. We conduct extensive experiments on cybersecurity datasets. The results demonstrate that the proposed model achieves better performance than existing baseline methods.
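
The abstract states only that the boundary module predicts entity heads and tails as an augmentation task; it does not say how the targets are built. As a minimal sketch, assuming the training data uses BIO tags, head and tail indicator labels for such an auxiliary task could be derived as below. The function name `boundary_labels` and the entity types `MAL` and `VUL` are hypothetical and do not come from the paper.

```python
from typing import List, Tuple


def boundary_labels(tags: List[str]) -> Tuple[List[int], List[int]]:
    """Derive per-token head/tail indicators from a BIO tag sequence.

    Illustrative only: an assumed way to build targets for an auxiliary
    boundary-prediction task, not the method described in the paper.
    """
    n = len(tags)
    heads = [0] * n
    tails = [0] * n
    for i, tag in enumerate(tags):
        if tag == "O":
            continue
        prefix, _, etype = tag.partition("-")
        prev_tag = tags[i - 1] if i > 0 else "O"
        next_tag = tags[i + 1] if i + 1 < n else "O"
        # A token opens an entity if it is tagged B-, or if it is an I- tag
        # that does not continue an entity of the same type.
        starts = (
            prefix == "B"
            or prev_tag == "O"
            or prev_tag.partition("-")[2] != etype
        )
        # A token closes an entity unless the next tag is I- of the same type.
        ends = not (
            next_tag.startswith("I-") and next_tag.partition("-")[2] == etype
        )
        heads[i] = int(starts)
        tails[i] = int(ends)
    return heads, tails


# Hypothetical six-token sentence: a three-token entity and a one-token entity.
example_tags = ["O", "B-MAL", "I-MAL", "I-MAL", "O", "B-VUL"]
example_heads, example_tails = boundary_labels(example_tags)
```

A single-token entity is marked as both head and tail, so a multi-token term such as a malware family name gets exactly one head and one tail regardless of its length.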
