Data parallelism
Computer science
Parallelism (grammar)
Parallel computing
Process (computing)
Task parallelism
Workload
Limit (mathematics)
Artificial neural network
Instruction-level parallelism
Artificial intelligence
Programming language
Mathematical analysis
Mathematics
Operating system
Authors
Weizheng Xu, Youtao Zhang, Xulong Tang
Source
Journal: Companion Proceedings of the Web Conference 2021
Date: 2021-04-19
Pages: 174-178
Cited by: 13
Identifier
DOI: 10.1145/3442442.3452055
Abstract
In recent years, Deep Neural Networks (DNNs) have emerged as a widely adopted approach in many application domains. Training DNN models is also becoming a significant fraction of the datacenter workload. Recent evidence has demonstrated that modern DNNs are becoming more complex and the size of DNN parameters (i.e., weights) is also increasing. In addition, a large amount of input data is required to train DNN models to reach target accuracy. As a result, training performance becomes one of the major challenges that limit DNN adoption in real-world applications. Recent works have explored different parallelism strategies (i.e., data parallelism and model parallelism) and used multiple GPUs in datacenters to accelerate the training process. However, naively adopting data parallelism and model parallelism across multiple GPUs can lead to sub-optimal execution. The major reasons are i) the large amount of data movement that prevents the system from feeding the GPUs with the required data in a timely manner (for data parallelism); and ii) low GPU utilization caused by data dependencies between layers placed on different devices (for model parallelism).
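To make the two strategies concrete, here is a minimal sketch, not taken from the paper, showing how each looks in PyTorch. The layer sizes, batch size, and two-GPU setup are illustrative assumptions; the class name `TwoDeviceNet` is hypothetical.

```python
# Minimal sketch (not from the paper) contrasting the two parallelism
# strategies the abstract describes. Assumes a machine with two CUDA GPUs;
# layer and batch sizes are illustrative.
import torch
import torch.nn as nn

# Data parallelism: each GPU holds a full replica of the model and works on
# a different slice of the batch. Feeding every replica its input slice and
# synchronizing gradients is the data-movement cost the abstract highlights.
model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10))
if torch.cuda.device_count() > 1:
    replicas = nn.DataParallel(model.cuda())       # scatters the batch across GPUs
    out = replicas(torch.randn(256, 1024).cuda())  # gathers outputs on GPU 0

# Model parallelism: layers live on different devices, so cuda:1 sits idle
# until cuda:0 finishes and its activations are copied over.
class TwoDeviceNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.part0 = nn.Linear(1024, 4096).to("cuda:0")
        self.part1 = nn.Linear(4096, 10).to("cuda:1")

    def forward(self, x):
        h = torch.relu(self.part0(x.to("cuda:0")))
        return self.part1(h.to("cuda:1"))  # activation transfer between devices
```

In the model-parallel sketch, cuda:1 cannot start until the activation transfer from cuda:0 completes; this inter-layer dependency stall is exactly the source of the low GPU utilization the abstract points to.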