Computer science
Inference
Inefficiency
Cloud computing
Server
Distributed computing
Reinforcement learning
Task (project management)
Adaptability
Artificial intelligence
Enhanced Data Rates for GSM Evolution (EDGE)
Latency (audio)
Machine learning
Computer network
Engineering
Biology
Operating system
Economics
Microeconomics
Systems engineering
Telecommunications
Ecology
Authors
Jingcheng Fang, Ying He, F. Richard Yu, Jianqiang Li, Victor C. M. Leung
Identifier
DOI:10.1109/vtc2023-fall60731.2023.10333824
Abstract
As research on and applications of large language models (LLMs) become increasingly sophisticated, it is difficult for resource-limited mobile terminals to run large-model inference tasks efficiently. Traditional deep reinforcement learning (DRL) based approaches have been used to offload LLM inference tasks to servers. However, existing solutions suffer from data inefficiency, insensitivity to latency requirements, and non-adaptability to task load variations. In this paper, we propose an active inference algorithm with rewardless guidance that uses expected future free energy to make offloading decisions and allocate resources for the LLM inference task offloading and resource allocation problem in cloud-edge network systems. Experimental results show that our proposed method outperforms mainstream DRL methods, improves data utilization efficiency, and adapts better to changing task load scenarios.
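The abstract's core idea can be illustrated with a minimal sketch: in active inference, an agent scores each candidate action by its expected free energy (risk, the divergence between predicted and preferred outcomes, plus ambiguity about observations) and picks the minimizer, with no explicit reward signal. The action set, outcome distributions, and ambiguity values below are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

# Candidate offloading targets for an LLM inference task (illustrative).
actions = ["local", "edge", "cloud"]

# Predicted distribution over latency outcomes (low / medium / high)
# for each action -- hypothetical numbers.
q_outcome = {
    "local": np.array([0.1, 0.3, 0.6]),  # on-device: likely high latency
    "edge":  np.array([0.6, 0.3, 0.1]),
    "cloud": np.array([0.4, 0.4, 0.2]),
}

# Prior preference over outcomes: the agent "prefers" low latency.
p_pref = np.array([0.8, 0.15, 0.05])

# Per-action ambiguity: expected uncertainty of observations (assumed).
ambiguity = {"local": 0.2, "edge": 0.5, "cloud": 0.9}

def expected_free_energy(a: str) -> float:
    """Risk (KL divergence from preferred outcomes) plus ambiguity."""
    q = q_outcome[a]
    risk = float(np.sum(q * np.log(q / p_pref)))
    return risk + ambiguity[a]

# Decision rule: choose the action with minimal expected free energy.
best = min(actions, key=expected_free_energy)
```

With these example numbers, offloading to the edge minimizes expected free energy because its predicted latency distribution is closest to the preference while keeping ambiguity moderate; the preference distribution plays the role that a reward function would in DRL.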