Continual Learning of Large Language Models: A Comprehensive Survey

Computer Science, Natural Language Processing, Artificial Intelligence

Authors

Haizhou Shi, Zihao Xu, Hengyi Wang, Weiyi Qin, Wenyuan Wang, Yibin Wang, Zifeng Wang, Sayna Ebrahimi, Hao Wang

Source

Journal: ACM Computing Surveys [Association for Computing Machinery]
Date: 2025-05-14 Volume (issue): 58 (5): 1-42 Citations: 19

Abstract

The challenge of effectively and efficiently adapting statically pre-trained Large Language Models (LLMs) to ever-evolving data distributions remains predominant. When tailored for specific needs, pre-trained LLMs often suffer from significant performance degradation in previous knowledge domains, a phenomenon known as “catastrophic forgetting.” While extensively studied in the Continual Learning (CL) community, this problem presents new challenges in the context of LLMs. In this survey, we provide a comprehensive overview and detailed discussion of the current research progress on LLMs within the context of CL. Besides the introduction of the preliminary knowledge, this survey is structured into four main sections: we first describe an overview of continually learning LLMs, consisting of two directions of continuity: vertical continuity (or vertical continual learning), i.e., continual adaptation from general to specific capabilities, and horizontal continuity (or horizontal continual learning), i.e., continual adaptation across time and domains (Section 3). Following vertical continuity, we summarize three stages of learning LLMs in the context of modern CL: Continual Pre-Training (CPT), Domain-Adaptive Pre-training (DAP), and Continual Fine-Tuning (CFT) (Section 4). We then provide an overview of evaluation protocols for continual learning with LLMs, along with currently available data sources (Section 5). Finally, we discuss intriguing questions related to continual learning for LLMs (Section 6). This survey sheds light on the relatively understudied domain of continually pre-training, adapting, and fine-tuning large language models, suggesting the necessity for greater attention from the community.
Key areas requiring immediate focus include the development of practical and accessible evaluation benchmarks, along with methodologies specifically designed to counter forgetting and enable knowledge transfer within the evolving landscape of LLM learning paradigms. The full list of articles examined in this survey is available at https://github.com/Wang-ML-Lab/llm-continual-learning-survey.
