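Computer science
Malware
Artificial intelligence
Data mining
Knowledge graph
Machine learning
Graph
Metric (unit)
Feature engineering
Deep learning
Information retrieval
Natural language processing
Theoretical computer science
Operations management
Economics
Operating system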
Authors
Chen Chen, Chunhe Xia, Tianbo Wang, Wanshuang Lin, Yuan Zhao, Li Yang
Identifier
DOI:10.1109/trustcom60117.2023.00171
Abstract
A Malware Knowledge Graph (MKG) is an essential auxiliary tool for malware detection and analysis. However, constructing an MKG faces several challenges, such as inadequate dataset quality, incomplete entity feature extraction, and the limitations imposed by deep learning techniques. To address these issues, we present an Efficient Framework for constructing a malware knowledge graph (EFwork). First, we build a High-Quality Dataset (HQDataset) and introduce a metric for data quality assessment based on knowledge coverage, timeliness, and density. Next, we develop a Named Entity Recognition (NER) model that extracts character features, part-of-speech features, and word features from the data and uses deep learning models to identify malware-related entities. Finally, we implement a rule-based filtering mechanism that uses a comprehensive Rule Database to eliminate entities that do not conform to predefined rules. Experimental results show that our HQDataset has higher data quality than other open-source datasets. Furthermore, our NER model combined with our Rule Database outperforms existing models, achieving improvements of 0.67%, 0.74%, and 0.69% in Precision, Recall, and F1-Score, respectively.
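The last two steps of the abstract, rule-based filtering of NER output and scoring by Precision, Recall, and F1-Score, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the labels `MD5`, `IP`, and `MALWARE`, the rules in `RULES`, and the helper names are hypothetical and are not taken from the paper's Rule Database or NER model.

```python
import re
from typing import Callable, Dict, List, Tuple

Entity = Tuple[str, str]  # (surface text, predicted label)


def is_md5(text: str) -> bool:
    # An MD5 digest is exactly 32 hexadecimal characters.
    return re.fullmatch(r"[0-9a-fA-F]{32}", text) is not None


def is_ipv4(text: str) -> bool:
    # Four dot-separated decimal octets, each in the range 0..255.
    if re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", text) is None:
        return False
    return all(int(part) <= 255 for part in text.split("."))


def is_malware_name(text: str) -> bool:
    # Hypothetical rule: at least two characters and not purely numeric.
    return len(text) >= 2 and not text.isdigit()


# Hypothetical stand-in for a rule database: one validity check per label.
RULES: Dict[str, Callable[[str], bool]] = {
    "MD5": is_md5,
    "IP": is_ipv4,
    "MALWARE": is_malware_name,
}


def filter_entities(entities: List[Entity]) -> List[Entity]:
    """Keep entities that satisfy the rule for their label.

    Entities whose label has no rule are passed through unchanged.
    """
    kept = []
    for text, label in entities:
        rule = RULES.get(label)
        if rule is None or rule(text):
            kept.append((text, label))
    return kept


def precision_recall_f1(predicted: List[Entity], gold: List[Entity]):
    """Exact-match Precision, Recall, and F1 over sets of (text, label) pairs."""
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data, invented for this example only.
predicted = [
    ("WannaCry", "MALWARE"),
    ("12345", "MALWARE"),
    ("d41d8cd98f00b204e9800998ecf8427e", "MD5"),
    ("999.1.1.1", "IP"),
    ("zzz", "MD5"),
]
gold = [
    ("WannaCry", "MALWARE"),
    ("d41d8cd98f00b204e9800998ecf8427e", "MD5"),
    ("10.0.0.1", "IP"),
]

filtered = filter_entities(predicted)
before = precision_recall_f1(predicted, gold)
after = precision_recall_f1(filtered, gold)
```

On this toy input the filter discards the three malformed predictions, so precision rises while recall is unchanged. A filter of this kind can only remove false positives; it cannot recover entities the NER model missed.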