Remote direct memory access
Computer science
Throughput
Cloud computing
Computer network
InfiniBand
Operating system
Wireless
Authors
Yanqing Chen,Chen Tian,Jiaqing Dong,Song Feng,Xu Zhang,Chang Liu,Peiwen Yu,Nai Xia,Wanchun Dou,Guihai Chen
Source
Journal: IEEE Transactions on Parallel and Distributed Systems
[Institute of Electrical and Electronics Engineers]
Date: 2023-01-01
Volume (issue): 34 (1): 63-75
Identifier
DOI:10.1109/tpds.2022.3215517
Abstract
Remote Direct Memory Access (RDMA) has been widely deployed in datacenters for its high performance. Large-scale high performance cloud services built on geographically distributed datacenters require long-range RDMA for performance requirements. However, existing RDMA solutions can hardly satisfy the stringent requirements of the emerging large-scale high-performance cloud services built on geo-distributed datacenters in terms of throughput and delay. On the one hand, lossless RDMA suffers from a deep buffer and potential suboptimal throughput for inter-datacenter traffic due to delayed response to Priority Flow Control (PFC) messages. On the other hand, lossy RDMA with selective retransmissions suffers from poor performance when multiple flows with different round-trip times (RTTs) coexist in cross-datacenter scenarios. This article proposes Swing, which expands the high-performance lossless RDMA to long-distance links through PFC-Relay. Swing ensures the throughput of long-distance links while minimizing the buffer requirement for long-range RDMA. It enables long-range RDMA without making any modifications to existing in-datacenter networks. The evaluation shows that Swing can reduce the average flow completion time (FCT) by 14%-66% in a variety of traffic scenarios.