Lv1
28 points, joined 2024-09-11
Efficient Memory Management for Large Language Model Serving with PagedAttention
2 months ago
Completed
FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving
2 months ago
Completed
Straggler-Aware Gradient Aggregation for Large-Scale Distributed Deep Learning System
3 months ago
Completed
Intelligent In-Network Attack Detection on Programmable Switches With Soterv2
3 months ago
Completed
P4-Secure: In-Band DDoS Detection in Software Defined Networks
3 months ago
Completed
FCC: Fast Fair Congestion Control in Data Center Networks with RDMA
3 months ago
Completed
CacheGen: KV Cache Compression and Streaming for Fast Large Language Model Serving
3 months ago
Completed
Towards Efficient Secure Aggregation based on In-Network Computing
6 months ago
Completed
Accelerating Federated Learning at Programmable User Plane Function via In-Network Aggregation
6 months ago
Completed
DINA: Toward Determined In-Network Aggregation for Distributed Machine Learning
6 months ago
Completed