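Computer science
Insider threat
Insider
Upload
Internet
Feature selection
Data mining
Set (abstract data type)
Machine learning
Process (computing)
Artificial intelligence
Data set
Noise (video)
World Wide Web
Political science
Law
Image (mathematics)
Programming language
Operating system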
Authors
Anupam Mittal,Urvashi Garg
Identifier
DOI:10.1109/icecct56650.2023.10179686
Abstract
Data is critical for large and small organizations alike, because customer trust depends on the privacy of the information they maintain. The key tool that every organization uses for accessing resources is the Internet. The use of technology and the Internet comes at a cost, which takes the form of cyber-attacks carried out over the Internet. One of the hardest attacks to detect is the insider attack, which originates from within the organization. The organization must distinguish between ordinary employees and malicious insiders. This paper proposes an optimization-based strategy for the detection of insider threats. Spider monkey optimization (SMO) is applied to detect the sentiments present in the CERT r4.2 dataset. This dataset was generated by Carnegie Mellon University and is used to detect insider threats. The overall process of insider threat detection starts with downloading the dataset from GitHub. The downloaded files are compressed, so they must be extracted before use. The CERT dataset contains a large number of files. For the proposed work (SMLDA optimization), the email and psychometric datasets are shortlisted. After the dataset is extracted, a pre-processing phase is applied. Within pre-processing, noise in the form of missing values is handled by rejecting records that contain null values in individual cells of the dataset. After pre-processing, dataset features are extracted from the content field of the dataset using a natural language processing toolbox. Feature selection is performed using the spider monkey approach. Selection is based on the contribution factor calculated with linear discriminant analysis (LDA). Using SMO, the document with the highest contribution built with LDA is selected. Finally, the polarity of the document is calculated using the TextBlob library. The result of the SMO-driven sentiment analysis (anger, neutral, negative, positive, and sad) is compared with the plain LDA approach.
SMO-driven sentiment analysis achieves a classification accuracy of 99%, whereas the LDA approach achieves a classification accuracy of 90%. Furthermore, it was discovered that negative and sad sentiments most often resulted in insider threats.
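The pre-processing step described in the abstract (rejecting records that contain null values in any cell) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation. The column names (`id`, `user`, `content`), the sample rows, and the helper `drop_incomplete_records` are assumptions made for the example and are not taken from the paper or from the CERT dataset schema.

```python
import csv
import io


def drop_incomplete_records(rows):
    """Keep only records in which every cell is a non-empty string.

    Hypothetical helper: treats None, empty, and whitespace-only cells
    as missing values, which is one possible reading of "null values".
    """
    return [
        row for row in rows
        if all(isinstance(value, str) and value.strip() for value in row.values())
    ]


# Illustrative sample only; these rows and column names are invented
# for the example and do not come from the CERT r4.2 files.
SAMPLE_CSV = """id,user,content
1,ABC0001,Meeting moved to Friday
2,,Please review the attached report
3,DEF0002,
4,GHI0003,I am unhappy with this decision
"""

records = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
clean_records = drop_incomplete_records(records)

for row in clean_records:
    print(row["id"], row["content"])
```

In this sample, the rows with a missing `user` or a missing `content` cell are rejected, and only the complete records are passed on to feature extraction. Whether the original work treats whitespace-only cells as null is not stated in the abstract, so that choice is an assumption here.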