Large Language Model-Aware In-Context Learning for Code Generation

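Computer science; Code generation; Context (archaeology); Encoding (set theory); Programming language; Software engineering; Artificial intelligence; Computer security; Key (lock); Biology; Paleontology; Set (abstract data type)
Authors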
Jia Li,Chongyang Tao,Jia Li,Ge Li,Zhi Jin,Huangzhao Zhang,Zheng Fang,Fang Liu
Source
Journal: ACM Transactions on Software Engineering and Methodology [Association for Computing Machinery]
Citations: 8
Identifier
DOI: 10.1145/3715908
Abstract

Large Language Models (LLMs) have shown impressive In-Context Learning (ICL) ability in code generation. LLMs take a prompt context consisting of a few demonstration examples and a new requirement as input, and output new programs without any parameter update. Existing studies have found that the performance of ICL-based code generation depends heavily on the quality of the demonstration examples, which has motivated research on selecting them: given a new requirement, a few demonstration examples are selected from a candidate pool, and LLMs are expected to learn the pattern hidden in the selected examples. Existing approaches mostly select examples either randomly or with heuristics. However, the distribution of randomly selected examples usually varies greatly, making the performance of LLMs less robust, and the heuristics retrieve examples by considering only the textual similarity of requirements, leading to sub-optimal performance. To fill this gap, we propose a Large language model-Aware selection approach for In-context-Learning-based code generation named LAIL. LAIL uses LLMs themselves to select examples: it asks the LLM to label each candidate example as a positive or a negative example for a requirement. Positive examples help LLMs generate correct programs, while negative examples are trivial and should be ignored. Based on the labeled positive and negative data, LAIL trains a model-aware retriever to learn the preference of LLMs and select the demonstration examples that LLMs need. During inference, given a new requirement, LAIL uses the trained retriever to select a few examples and feeds them into LLMs to generate the desired programs. We apply LAIL to four widely used LLMs and evaluate it on five code generation datasets.
Extensive experiments demonstrate that, in terms of Pass@1 on MBJP, MBPP, and MBCPP respectively, LAIL outperforms the state-of-the-art (SOTA) baselines by 11.58%, 3.33%, and 5.07% on CodeGen-Multi-16B, by 1.32%, 2.29%, and 1.20% on CodeLlama-34B, and by 4.38%, 2.85%, and 2.74% on Text-davinci-003. In addition to function-level code generation, LAIL improves the performance of LLMs on DevEval, a repository-level code generation dataset, achieving 10.04%, 8.12%, and 4.63% improvements over the SOTA baselines at Pass@1, Pass@3, and Pass@5 on CodeLlama-7B. Human evaluation further verifies that the programs generated with LAIL are superior in correctness, code quality, and maintainability. In addition, LAIL transfers satisfactorily across LLMs and datasets: a retriever learned on one LLM (or dataset) can be applied to other LLMs (or datasets).
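The three stages described in the abstract (labeling candidates with LLM feedback, selecting demonstrations with a retriever, and assembling the prompt) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not say how the LLM produces its labels or how the retriever is built, so `llm_score` and `retriever_score` are hypothetical callables, and the toy word-overlap scorer below only stands in for a trained retriever. The prompt layout in `build_prompt` is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Example:
    requirement: str
    program: str


def label_candidates(
    candidates: List[Example],
    llm_score: Callable[[Example], float],  # hypothetical: feedback from the LLM
    n_pos: int = 1,
    n_neg: int = 1,
) -> Tuple[List[Example], List[Example]]:
    """Rank candidates by an LLM-derived score: top ones are positives, bottom ones negatives."""
    ranked = sorted(candidates, key=llm_score, reverse=True)
    return ranked[:n_pos], ranked[len(ranked) - n_neg:]


def select_examples(
    requirement: str,
    pool: List[Example],
    retriever_score: Callable[[str, Example], float],  # hypothetical trained retriever
    k: int = 2,
) -> List[Example]:
    """Return the k pool examples the retriever scores highest for this requirement."""
    ranked = sorted(pool, key=lambda ex: retriever_score(requirement, ex), reverse=True)
    return ranked[:k]


def build_prompt(demos: List[Example], requirement: str) -> str:
    """Concatenate the demonstrations and the new requirement (assumed layout)."""
    parts = [f"# Requirement:\n{d.requirement}\n# Program:\n{d.program}\n" for d in demos]
    parts.append(f"# Requirement:\n{requirement}\n# Program:\n")
    return "\n".join(parts)


# Toy data and toy scorers, for illustration only.
pool = [
    Example("add two numbers", "def add(a, b): return a + b"),
    Example("reverse a string", "def rev(s): return s[::-1]"),
    Example("sum a list of numbers", "def total(xs): return sum(xs)"),
]
toy_scores = {pool[0]: 0.9, pool[1]: 0.1, pool[2]: 0.6}  # made-up LLM feedback


def toy_retriever(requirement: str, ex: Example) -> int:
    # Word overlap is a placeholder; a learned retriever would replace it.
    return len(set(requirement.split()) & set(ex.requirement.split()))


positives, negatives = label_candidates(pool, lambda ex: toy_scores[ex])
demos = select_examples("add three numbers", pool, toy_retriever, k=2)
prompt = build_prompt(demos, "add three numbers")
```

In a real pipeline, the positives and negatives would be used as training data for the retriever (for instance with a contrastive objective, which is an assumption here), and `prompt` would be sent to the LLM to generate the program.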