Lv2
120 points, joined 2025-11-09
Reliable and Efficient LLM Inference on Resource-Constrained Mobile Devices via Dynamic Scheduling
2 days ago
Completed
TightLLM: Maximizing Throughput for LLM Inference via Adaptive Offloading Policy
2 days ago
Completed
Research on lightweight optimization of large language models for resource-constrained environments
3 months ago
Completed
Edge-LLM: A Collaborative Framework for Large Language Model Serving in Edge Computing
4 months ago
Completed
Enhancing LLM QoS through Cloud-Edge Collaboration: A Diffusion-based Multi-Agent Reinforcement Learning Approach
4 months ago
Completed
Joint Inference Offloading and Model Caching for Small and Large Language Model Collaboration
4 months ago
Completed
Large Language Models (LLMs) Inference Offloading and Resource Allocation in Cloud-Edge Computing: An Active Inference Approach
4 months ago
Completed
FlexLLM: Token-Level Co-Serving of LLM Inference and Finetuning with SLO Guarantees
4 months ago
Completed
Joint Inference Offloading and Model Caching for Small and Large Language Model Collaboration
4 months ago
Completed
EdgeNetLLM: Cloud–Edge Collaborative Adaptation of Large Language Models for Mobile Networking
4 months ago
Completed