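Computer science
Vulnerability (computing)
Artificial intelligence
Deep learning
Machine learning
Convolutional neural network
Recurrent neural network
False positive rate
Dependency (UML)
Data mining
Artificial neural network
Computer security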
Authors
Zhen Li,Deqing Zou,Jing Tang,Zhihao Zhang,Mucun Sun,Hai Jin
Source
Journal: IEEE Access
[Institute of Electrical and Electronics Engineers]
Date: 2019-01-01
Volume and pages: 7: 103184-103197
Citations: 62
Identifier
DOI:10.1109/access.2019.2930578
Abstract
Source code static analysis has been widely used to detect vulnerabilities during the development of software products. Vulnerability patterns defined purely by human experts are laborious to produce and error-prone, which has motivated the use of machine learning for vulnerability detection. To relieve human experts of defining vulnerability rules or features, a recent study shows the feasibility of leveraging deep learning to detect vulnerabilities automatically. However, the impact of different factors on the effectiveness of vulnerability detection is unknown. In this paper, we collect two datasets from programs involving 126 types of vulnerabilities, on which we conduct the first comparative study to quantitatively evaluate the impact of different factors on the effectiveness of vulnerability detection. The experimental results show that accommodating control dependency can increase the overall F1-measure of vulnerability detection by 20.3%; the imbalanced data processing methods are not effective for the dataset we create; bidirectional recurrent neural networks (RNNs) are more effective than unidirectional RNNs and convolutional neural networks, which in turn are more effective than the multi-layer perceptron; and using the output corresponding to the last time step of the bidirectional long short-term memory (BLSTM) network can reduce the false negative rate by 2.0% at the price of increasing the false positive rate by 0.5%.
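The abstract reports its results as F1-measure, false negative rate, and false positive rate. As a reminder of how these relate, the sketch below computes them from confusion-matrix counts using the standard textbook definitions. The function name `detection_metrics` and the example counts are illustrative assumptions and are not taken from the paper.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from raw counts.

    tp: vulnerable samples flagged as vulnerable
    fp: non-vulnerable samples flagged as vulnerable
    tn: non-vulnerable samples flagged as non-vulnerable
    fn: vulnerable samples flagged as non-vulnerable
    (Hypothetical helper; the paper's own evaluation code is not shown here.)
    """
    # False positive rate: share of non-vulnerable samples wrongly flagged.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # False negative rate: share of vulnerable samples that were missed.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    # Precision and recall feed into the F1-measure.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"fpr": fpr, "fnr": fnr, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up counts purely for illustration.
    for name, value in detection_metrics(tp=80, fp=10, tn=90, fn=20).items():
        print(f"{name}: {value:.4f}")
```

Because the false negative rate equals one minus recall, a change that lowers it while raising the false positive rate, as reported for the BLSTM variant, trades missed vulnerabilities against extra false alarms.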