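Computer science
Network on a chip
Distributed computing
Crossbar switch
Network topology
Exploit
Parallel computing
Computer network
Efficient energy use
General-purpose computing on graphics processing units
Telecommunications
Graphics
Computer graphics (images)
Computer security
Electrical engineering
Engineering
Authors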
Xianwei Cheng,Yang Zhao,Mohammadreza Robaei,Beilei Jiang,Hui Zhao,Juan Fang
Identifier
DOI:10.1109/ancs.2019.8901890
Abstract
GPGPU-accelerated systems demand high data-communication throughput in order to fully exploit thread-level parallelism. Most current GPGPU Networks-on-Chip (NoCs) employ topologies adapted from CPUs, such as mesh and crossbar. However, the trade-off between performance and cost for such networks is sub-optimal because of the unique traffic pattern of GPUs. In this work, we propose a novel NoC architecture called fused fat tree, which modifies the fat tree to match the GPU traffic pattern. By connecting memory controllers to the tree roots and computing cores to the leaves, protocol deadlocks can be avoided using just one physical network. However, this modification removes the path diversity of the original fat tree topology and makes the network vulnerable to hotspot-induced congestion. To solve this problem, we propose fusing routers with side links to create multiple paths. A load-balancing routing algorithm is also proposed to increase network throughput. In addition, we propose a novel preemptive bandwidth allocation scheme that improves resource utilization by taking advantage of slack in request messages. Our evaluation results show that our design can improve performance by 46% while achieving area and energy savings of 27% and 25%, respectively, on average.
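
The path-diversity point in the abstract can be illustrated with a toy graph. The sketch below is not the authors' topology or routing algorithm: the node names (`core0`..`core3`, `r0`, `r1`, `root`, `mc`), the two-level layout, and both helper functions are hypothetical, chosen only to show that a tree with a memory controller at the root and cores at the leaves offers a single path per core, and that one side link between sibling routers adds a second.

```python
from collections import defaultdict


def build_tree(side_links=False):
    """Build a toy 2-level tree (hypothetical, not the paper's topology).

    Four cores hang off two leaf routers; both leaf routers connect to one
    root router, which connects to a single memory controller.
    """
    adj = defaultdict(set)

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    for core, router in [("core0", "r0"), ("core1", "r0"),
                         ("core2", "r1"), ("core3", "r1")]:
        link(core, router)
    link("r0", "root")
    link("r1", "root")
    link("root", "mc")
    if side_links:
        # Assumed side link between the two sibling leaf routers.
        link("r0", "r1")
    return adj


def count_simple_paths(adj, src, dst, visited=frozenset()):
    """Count loop-free paths from src to dst by depth-first search."""
    if src == dst:
        return 1
    visited = visited | {src}
    return sum(count_simple_paths(adj, nxt, dst, visited)
               for nxt in adj[src] if nxt not in visited)


if __name__ == "__main__":
    print(count_simple_paths(build_tree(False), "core0", "mc"))
    print(count_simple_paths(build_tree(True), "core0", "mc"))
```

In this toy case the extra path is longer than the direct one, so a load-balancing router would presumably use it only when the direct path is congested; how the paper actually makes that choice is not stated in the abstract.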