Design And Analysis Of Insider Threat Detection And Prediction System Using Machine Learning Techniques

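Computer science; Insider threat; Insider; Upload; Internet; Feature selection; Data mining; Set (abstract data type); Machine learning; Process (computing); Artificial intelligence; Dataset; Noise (video); World Wide Web; Political science; Law; Image (mathematics); Programming language; Operating system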

Authors

Anupam Mittal, Urvashi Garg

Identifier

DOI: 10.1109/icecct56650.2023.10179686

Abstract

Data is critical for large and small organizations alike, because customer trust depends on the privacy of the information they maintain. The Internet is the key tool that every organization uses for accessing resources, but the use of technology and the Internet comes at a cost: exposure to cyber-attacks. One of the hardest attacks to detect is the insider threat, which originates from within the organization, so the organization must distinguish between employees and insiders. This paper proposes an optimization-based strategy for the detection of insider threats. Spider monkey optimization (SMO) is applied to detect the sentiments present within the R4.2 CERT dataset, which was generated by Carnegie Mellon University and is used to detect insider threats. The overall process starts with downloading the dataset from GitHub. The downloaded files are compressed, so they must be extracted before use. The CERT dataset contains a large number of files; for the proposed work (SMLDA optimization), the email and psychometric datasets are shortlisted. After extraction, a pre-processing phase is applied in which noise, in the form of missing values, is handled by rejecting records that contain null values in any cell. After pre-processing, features are extracted from the content field of the dataset using the natural language processing toolbox. Feature selection is performed using the spider monkey approach, based on a contribution factor calculated with linear discriminant analysis (LDA): using SMO, the document built with LDA that has the highest contribution is selected. Finally, the polarity of the document is calculated using the TextBlob library. The result of the SMO-driven sentiment analysis (anger, neutral, negative, positive, and sad) is compared with the plain LDA approach. SMO-driven sentiment analysis achieves a classification accuracy of 99%, whereas the LDA approach achieves 90%. Furthermore, negative and sad sentiments were found to result most often in insider threats.
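
The pre-processing step described in the abstract (rejecting any record that contains a null value) can be illustrated with a minimal sketch. This is not the authors' code: the sample rows, the column names (`id`, `user`, `content`), and the helper `drop_incomplete_records` are all hypothetical, and only the Python standard library is used.

```python
import csv
import io


def drop_incomplete_records(rows):
    """Keep only records in which every field is a non-empty string."""
    return [
        row for row in rows
        if all(isinstance(value, str) and value.strip() for value in row.values())
    ]


# Hypothetical sample; the real CERT email file has a different, larger schema.
SAMPLE_CSV = """id,user,content
1,ABC0001,Meeting moved to Friday
2,,Please review the attached report
3,DEF0002,
4,GHI0003,I am unhappy with this decision
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
clean_rows = drop_incomplete_records(rows)

# Rows 2 and 3 each have an empty cell, so only rows 1 and 4 remain.
print([row["id"] for row in clean_rows])
```

In this sketch an empty or whitespace-only cell is treated as null; a real pipeline would also need to decide how to handle placeholder strings such as "NA".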
