A Method for Extracting Unstructured Threat Intelligence Based on Dictionary Template and Reinforcement Learning

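Computer science, Artificial intelligence, Ambiguity, Machine learning, Process (computing), Construct (Python library), Programming language, Operating system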

Authors

Xuren Wang, Rong Chen, Binghua Song, Jian Yang, Zhengwei Jiang, Xiaoqing Zhang, Xiaomeng Li, Shengqin Ao

Source

Journal: Computer Supported Cooperative Work in Design; Date: 2021-05-05; Citations: 2

Identifier

DOI: 10.1109/cscwd49262.2021.9437858

Abstract

In recent years, individuals, organizations, and countries have all been threatened by cyber threats to some degree. Threat intelligence sharing schemes have greatly helped protect cyber security. Traditional threat intelligence sharing schemes mainly collect and analyze information manually, including but not limited to Indicators of Compromise (IOC), and form a machine-readable report for the Security Operations Center (SOC) to take corresponding action. It is therefore both challenging and significant to share and exchange cyber threat intelligence (CTI) easily and automatically. To extract CTI information efficiently, we construct an automatic information extraction model covering entity recognition and relation extraction, which extracts the effective entities and relations in threat intelligence reports and improves the efficiency of threat intelligence sharing. The specific content and research results cover two aspects: (1) Research on a threat intelligence entity recognition model. Building on the classic BiLSTM-CRF neural network, we use BERT as the corpus pre-training model and propose DT-BERT-BiLSTM-CRF, a model based on a dictionary template. The BERT pre-training model makes full use of the contextual semantic information of the corpus and alleviates the problem of ambiguity in threat intelligence entity recognition. Constructing a dictionary template of threat intelligence entities further improves the accuracy of entity recognition in the threat intelligence field. (2) Research on the extraction of CTI relations. We construct the relation extraction data set with distant supervision. To alleviate noisy annotation data, we introduce the attention mechanism and reinforcement learning into traditional neural networks and propose the model NR-RL-PCNN-ATT. Through a new reward mechanism, our model improves sentence selection quality and the efficiency of relation extraction.
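The abstract does not say how the dictionary template enters the DT-BERT-BiLSTM-CRF model. As a rough illustration only, one common way to use an entity dictionary is to pre-tag tokens with BIO labels by longest-match lookup, producing features that a downstream tagger could consume. The function name, the label set (`TOOL`, `ACTOR`), and the example dictionary below are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def dictionary_bio_tags(tokens: List[str], dictionary: Dict[str, str]) -> List[str]:
    """Assign BIO tags by case-insensitive longest-match dictionary lookup.

    This is an assumed, simplified stand-in for a "dictionary template";
    the paper's actual construction may differ.
    """
    # Index dictionary phrases as tuples of lowercase tokens.
    entries = {tuple(phrase.lower().split()): label
               for phrase, label in dictionary.items()}
    max_len = max((len(key) for key in entries), default=0)
    lowered = [token.lower() for token in tokens]
    tags = ["O"] * len(tokens)

    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span starting at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            label = entries.get(tuple(lowered[i:i + n]))
            if label is not None:
                tags[i] = "B-" + label
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + label
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


# Hypothetical dictionary and sentence, for illustration only.
example_dictionary = {"Cobalt Strike": "TOOL", "APT28": "ACTOR"}
example_tokens = "APT28 deployed Cobalt Strike beacons".split()
example_tags = dictionary_bio_tags(example_tokens, example_dictionary)
print(list(zip(example_tokens, example_tags)))
```

In a setup like this, the dictionary tags would typically be combined with contextual embeddings rather than used as final predictions, since a plain lookup cannot resolve the ambiguity that the abstract attributes to BERT.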
