Leveraging RAG-Enhanced Large Language Model for Semi-Supervised Log Anomaly Detection
Anomaly detection
Computer science
Artificial intelligence
Natural language processing
Authors
Wanhao Zhang, Qianli Zhang, Enyu Yu, Yuxiang Ren, Yeqing Meng, Miao Qiu, Jilong Wang
Identifier
DOI:10.1109/issre62328.2024.00026
Abstract
Log-based anomaly detection is critical for monitoring the operation of information systems and for reporting system failures in real time. Deep learning-based methods can detect anomalies in logs effectively. However, existing methods depend heavily on log parsers, and parsing errors can considerably degrade downstream anomaly detection. Additionally, methods that predict the next log event in a sequence are sensitive to unstable sequences and to the unseen logs that emerge as systems evolve, which results in a higher false positive rate. In this paper, we put forward LogRAG, a semi-supervised log anomaly detection framework based on retrieval-augmented generation (RAG). The framework performs phased detection using both Log Tokens and Log Templates to mitigate the impact of log parsing errors. It also uses a single-class classifier to model the normal behavior of the system, thereby circumventing the effects of unstable sequences. Finally, it employs a large language model (LLM) empowered by RAG to reevaluate the logs flagged as anomalous, thereby improving accuracy. LogRAG demonstrates a 15% improvement in F1 Score on the BGL dataset and a 60% improvement on the Spirit dataset when compared to the previous best semi-supervised learning algorithm.
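
The two-stage idea in the abstract (model normal behavior only, then re-check flagged logs with retrieved context) can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-neighbor cosine similarity over bag-of-tokens vectors is a simple stand-in for the single-class classifier, the 0.5 threshold is arbitrary, and the names `OneClassLogModel`, `retrieve`, and `build_prompt` are hypothetical. The LLM call itself is left out; only the retrieval-augmented prompt is assembled.

```python
import math
from collections import Counter


def vectorize(line):
    """Bag-of-tokens vector for one log line (whitespace tokenization)."""
    return Counter(line.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[tok] for tok, count in a.items() if tok in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class OneClassLogModel:
    """Hypothetical stand-in: fitted on normal logs only (semi-supervised)."""

    def __init__(self, normal_logs, threshold=0.5):
        # threshold is an assumed value, not taken from the paper
        self.normal = [(line, vectorize(line)) for line in normal_logs]
        self.threshold = threshold

    def score(self, line):
        """Similarity to the closest known-normal log line."""
        query = vectorize(line)
        return max((cosine(query, vec) for _, vec in self.normal), default=0.0)

    def is_candidate_anomaly(self, line):
        """Stage 1: flag lines that resemble no known-normal line."""
        return self.score(line) < self.threshold

    def retrieve(self, line, k=2):
        """Retrieval step: the k most similar known-normal lines."""
        query = vectorize(line)
        ranked = sorted(self.normal, key=lambda item: cosine(query, item[1]),
                        reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(line, retrieved):
    """Stage 2 input: a prompt that an LLM could use to re-evaluate the line."""
    context = "\n".join("- " + text for text in retrieved)
    return ("Known normal log lines:\n" + context +
            "\nIs the following log line anomalous?\n" + line)


normal_logs = [
    "connection established to node 1",
    "connection closed by node 1",
    "heartbeat ok from node 2",
]
model = OneClassLogModel(normal_logs)
suspect = "kernel panic fatal memory error"
if model.is_candidate_anomaly(suspect):
    prompt = build_prompt(suspect, model.retrieve(suspect))
```

Because only flagged lines reach the second stage, the more expensive LLM re-evaluation runs on a small fraction of the log stream, and its job is to reduce false positives from the first stage.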