Computer science
Similarity (geometry)
Machine learning
Artificial intelligence
Quality (philosophy)
Proportion (ratio)
Turbine
Engineering
Philosophy
Physics
Epistemology
Quantum mechanics
Automotive engineering
Image (mathematics)
Authors
Mateusz Kochanek,Igor Cichecki,Oliwier Kaszyca,Dominika Szydło,Michał Madej,Dawid Jędrzejewski,Przemysław Kazienko,Jan Kocoń
Source
Journal: Electronics
[Multidisciplinary Digital Publishing Institute]
Date: 2024-06-08
Volume/Issue: 13 (12): 2255
Citations: 8
Identifier
DOI:10.3390/electronics13122255
Abstract
The rapid evolution of large language models, in particular OpenAI’s GPT-3.5-turbo and GPT-4, indicates a growing interest in advanced computational methodologies. This paper proposes a novel approach to synthetic data generation and knowledge distillation through prompt engineering. The potential of large language models (LLMs) is used to address the problem of unbalanced training datasets for other machine learning models. This is not only a common issue but also a crucial determinant of the final model quality and performance. Three prompting strategies have been considered: basic, composite, and similarity prompts. Although the initial results do not match the performance of comprehensive datasets, the similarity prompts method shows considerable promise and outperforms the other methods. The investigation of our rebalancing methods opens pathways for future research on leveraging continuously developed LLMs for the enhanced generation of high-quality synthetic data. This could have an impact on many large-scale engineering applications.
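To make the rebalancing idea concrete, the sketch below shows one way an LLM-driven oversampling loop could look in Python: minority classes are topped up with synthetic texts generated from a few-shot "similarity" style prompt. The prompt wording, the `llm_generate` callable, and the oversample-to-the-majority-class target are illustrative assumptions, not the authors' exact prompts or pipeline.

```python
import random
from collections import Counter
from typing import Callable

def similarity_prompt(label: str, examples: list[str]) -> str:
    # Few-shot prompt in the spirit of a "similarity" strategy: show real
    # examples of the minority class and ask for a new, similar (not copied) one.
    # The exact wording here is an assumption for illustration.
    shown = "\n".join(f"- {t}" for t in examples)
    return (
        f"Here are texts labelled '{label}':\n{shown}\n"
        "Write one new text with the same label, similar in topic and style, "
        "but not a copy of any example."
    )

def rebalance(texts: list[str], labels: list[str],
              llm_generate: Callable[[str], str], k_examples: int = 3):
    """Generate synthetic (text, label) pairs for minority classes until every
    class matches the size of the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    synthetic: list[tuple[str, str]] = []
    for label, count in counts.items():
        pool = [t for t, l in zip(texts, labels) if l == label]
        for _ in range(target - count):
            shots = random.sample(pool, min(k_examples, len(pool)))
            synthetic.append((llm_generate(similarity_prompt(label, shots)), label))
    return synthetic
```

Here `llm_generate` is a placeholder for any chat-completion call (e.g., to GPT-3.5-turbo or GPT-4); passing it in as a parameter keeps the rebalancing logic independent of a particular LLM client.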