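Computer science
Malware
Software
Sequence (biology)
Feature (linguistics)
Embedding
Artificial intelligence
Data mining
Object (grammar)
Convolutional neural network
Machine learning
Programming language
Operating system
Philosophy
Biology
Genetics
Linguistics
Authors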
Ce Li,Qiujian Lv,Ning Li,Yan Wang,Degang Sun,Yuanyuan Qiao
Identifier
DOI:10.1016/j.cose.2022.102686
Abstract
Dynamic malware detection executes software in a secured virtual environment and monitors its run-time behavior. This technique widely uses API sequence analysis to decide whether the running software is malicious. However, existing solutions typically consider only the API name or the frequency of API usage, and they do not mine the features of the API sequence sufficiently, which allows some malware to evade detection. In this paper, we propose a novel malware detection framework that uses deep learning models to capture and combine more meaningful features, which we call the intrinsic features of the API sequence. First, we apply embedding and convolutional layers to build a joint representation of multiple APIs that represents the software behavior. Second, we use the category, action, and operation object of each API to represent the semantic information of each API call. Finally, we use a Bi-LSTM module to mine the relationship information between APIs. Our proposed model achieves an accuracy of 0.9731 and an F1-score of 0.9724 on a large real dataset, which significantly outperforms the baselines. We also conduct ablation studies to demonstrate the effectiveness of our intrinsic features.
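The abstract does not say how the category, action, and operation object are obtained for each API call, so the following is only a minimal sketch of the idea, not the authors' method. It assumes Windows-style CamelCase API names; the prefix table `CATEGORY_BY_PREFIX`, the verb set `ACTIONS`, and the helpers `describe_api` and `sliding_windows` are hypothetical names introduced for illustration. The window helper groups neighbouring calls, loosely mirroring what a convolution over embedded calls would see; the embedding, convolutional, and Bi-LSTM layers themselves are not modelled here.

```python
import re
from typing import List, NamedTuple, Sequence, Tuple

# Hypothetical vocabularies, chosen only for illustration; the paper's
# actual category and action sets are not given in the abstract.
CATEGORY_BY_PREFIX = {"Reg": "registry", "Nt": "system", "Internet": "network"}
ACTIONS = {"Create", "Open", "Read", "Write", "Delete", "Query", "Set", "Close"}
IGNORED_SUFFIXES = {"Ex", "A", "W"}  # common Windows API variant markers


class ApiSemantics(NamedTuple):
    category: str
    action: str
    obj: str


def describe_api(name: str) -> ApiSemantics:
    """Split an API name into a (category, action, object) triple."""
    category, rest = "other", name
    for prefix, label in CATEGORY_BY_PREFIX.items():
        if name.startswith(prefix):
            category, rest = label, name[len(prefix):]
            break
    # Break the remaining CamelCase name into words, e.g. "OpenKeyExW"
    # becomes ["Open", "Key", "Ex", "W"].
    words = re.findall(r"[A-Z][a-z0-9]*", rest)
    action = next((w for w in words if w in ACTIONS), "unknown")
    obj = "".join(w for w in words if w != action and w not in IGNORED_SUFFIXES)
    return ApiSemantics(category, action, obj or "unknown")


def sliding_windows(calls: Sequence, width: int) -> List[Tuple]:
    """Return every run of `width` consecutive calls in the sequence."""
    return [tuple(calls[i:i + width]) for i in range(len(calls) - width + 1)]


trace = ["NtCreateFile", "RegOpenKeyExW", "InternetOpenUrlA"]
triples = [describe_api(call) for call in trace]
windows = sliding_windows(triples, 2)
```

In a real pipeline, each field of the triple would presumably be mapped to an integer index and fed to its own embedding table before the convolutional and recurrent layers; that mapping is left out because the abstract does not describe it.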