Research on Unified Cyber Threat Intelligence Entity Recognition Method Based on Multiple Features
Computer science
Intelligence analysis
Artificial intelligence
Computer security
Authors
Yu Chang,Gang Wang,Peng Zhu,Jingheng He,Lingfei Kong
Identifier
DOI:10.1109/cait59945.2023.10469250
Abstract
The frequent occurrence of advanced persistent threat (APT) attacks targeting industrial control systems has made knowledge graphs of cyber threat intelligence (CTI) increasingly crucial for recognizing and analyzing such attacks. However, most CTI exists as unstructured natural language text, so named entity recognition (NER) techniques are needed to extract the threat entities it contains. NER research for CTI still faces three problems: Out-of-Vocabulary (OOV) words, discontinuous entity recognition, and unbalanced data distribution. To extract threat entities from CTI more comprehensively and accurately, this paper proposes a unified threat entity recognition model based on multiple features. First, the model enhances the semantic feature by integrating the character feature, the part-of-speech (POS) feature, and the position feature, which mitigates the OOV problem. Next, it introduces a method for recognizing discontinuous entities based on word-word relation classification into NER for CTI. Finally, it incorporates the Focal Loss algorithm, which reduces the impact of non-entity data on the loss function through a weighting factor and a modulating factor. Experimental results on the CTI dataset DNRTI show that, compared with existing threat entity recognition methods, the proposed model reaches an F1 value of 89.41% for recognizing threat entities, an improvement of 0.48%, and that it can effectively recognize discontinuous entities.
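The role of the weighting factor and the modulating factor can be illustrated with the standard Focal Loss formulation, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The sketch below is illustrative only and is not taken from the paper: the function names, the default gamma = 2.0, and the example probabilities are assumptions, and the abstract does not state which hyperparameters the authors used.

```python
import math


def cross_entropy(probs, target):
    """Plain cross-entropy for one sample: -log(p_t)."""
    return -math.log(probs[target])


def focal_loss(probs, target, alpha=None, gamma=2.0):
    """Focal loss for one sample (hypothetical helper, not the authors' code).

    probs  : class probabilities for the sample (assumed to sum to 1)
    target : index of the true class
    alpha  : optional per-class weighting factors; None means weight 1.0
    gamma  : exponent of the modulating factor (1 - p_t) ** gamma (assumed default)
    """
    p_t = probs[target]
    weight = 1.0 if alpha is None else alpha[target]
    modulating = (1.0 - p_t) ** gamma
    return -weight * modulating * math.log(p_t)


if __name__ == "__main__":
    # A confidently classified non-entity token versus a poorly classified entity token.
    easy = [0.9, 0.1]   # true class 0 predicted with p_t = 0.9
    hard = [0.1, 0.9]   # true class 0 predicted with p_t = 0.1
    for name, probs in (("easy", easy), ("hard", hard)):
        ce = cross_entropy(probs, 0)
        fl = focal_loss(probs, 0)
        print(f"{name}: cross-entropy={ce:.4f} focal={fl:.4f} ratio={fl / ce:.4f}")
```

With gamma = 0 and no alpha, the function reduces to ordinary cross-entropy. As gamma grows, well-classified samples, which in NER are mostly non-entity tokens, contribute progressively less to the total loss, while alpha allows a separate fixed weight per class.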