Persona
Grounded theory
Distillation
Computer science
Human-computer interaction
Knowledge management
Sociology
Chemistry
Qualitative research
Chromatography
Social science
Authors
Linmei Hu, Xinyu Zhang, Dandan Song, Changzhi Zhou, Hongyu He, Liqiang Nie
Abstract
Incorporating explicit personas into dialogue models is critical for generating responses that fulfill specific user needs and preferences, creating more personalized and engaging interactions. Early works on persona-based dialogue generation directly concatenate the persona descriptions and dialogue history and feed them into relatively small pre-trained language models (PLMs) for response generation, which leads to uninformative and inferior results due to sparse persona information and limited model generation capabilities. Recently, large language models (LLMs) have shown surprising capabilities in language generation, and prompting LLMs with persona descriptions for role-playing dialogue generation has also achieved promising results. However, deploying LLMs in practical applications is challenging due to their large scale, spurring efforts to distill their generation capabilities into more compact models through teacher-student learning. In this paper, we propose an efficient compact Knowledge-grounded Persona-based Dialogue model enhanced by LLM Distillation (KPDD). Specifically, we first enrich the annotated persona descriptions by integrating external knowledge graphs (KGs) via a mixed encoding network, coupled with a mixture-of-experts (MoE) module for informative and diverse response generation. The mixed encoding network contains multiple layers of modality interaction operations, enabling information from each modality to propagate to the other. Second, to fully exploit the generation capabilities of LLMs, we distill them into our model, using a natural language inference (NLI) based filtering mechanism to extract high-quality information from the LLM outputs. In addition, we employ a curriculum learning strategy that trains our model first on the high-quality filtered distilled data and then progressively on the relatively noisy original data, enhancing its adaptability and performance. Extensive experiments show that KPDD outperforms state-of-the-art baselines in terms of both automatic and human evaluation.