Persona
Grounded theory
Distillation
Computer science
Human-computer interaction
Knowledge management
Sociology
Chemistry
Qualitative research
Chromatography
Social science
Authors
Linmei Hu, Xinyu Zhang, Dandan Song, Changzhi Zhou, Hongyu He, Liqiang Nie
Abstract
Incorporating explicit personas into dialogue models is critical for generating responses that fulfill specific user needs and preferences, creating more personalized and engaging interactions. Early works on persona-based dialogue generation directly concatenate the persona descriptions and dialogue history and feed them into relatively small pre-trained language models (PLMs) for response generation, which leads to uninformative and inferior results due to sparse persona information and limited model generation capabilities. Recently, large language models (LLMs) have shown surprising capabilities in language generation, and prompting LLMs with persona descriptions for role-playing dialogue generation has also achieved promising results. However, deploying LLMs in practical applications is challenging due to their large scale, spurring efforts to distill their generation capabilities into more compact models through teacher-student learning. In this paper, we propose an efficient compact Knowledge-grounded Persona-based Dialogue model enhanced by LLM Distillation (KPDD). Specifically, we first enrich the annotated persona descriptions by integrating external knowledge graphs (KGs) via a mixed encoding network, coupled with a mixture-of-experts (MoE) module for informative and diverse response generation. The mixed encoding network contains multiple layers of modality interaction operations, enabling information from each modality to propagate to the other. Second, to fully exploit the generation capabilities of LLMs, we distill them into our model, using a natural language inference (NLI) based filtering mechanism to extract high-quality information from the LLM outputs. In addition, we employ a curriculum learning strategy that trains our model first on the high-quality filtered distilled data and then progressively on the relatively noisy original data, enhancing its adaptability and performance. Extensive experiments show that KPDD outperforms state-of-the-art baselines in terms of both automatic and human evaluation.