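Computer science
Verlet integration
Parallel computing
Acceleration
Graphics processing unit
Computational science
Molecular dynamics
Scaling
Codebase
Graphics
CUDA
General-purpose computing on graphics processing units
Speedup
Benchmark (surveying)
Scalability
Massively parallel
Central processing unit
Supercomputer
Coprocessor
Data structure
Hardware acceleration
Linear scale
Interface (matter)
Single-precision floating-point format
Dynamic simulation
Parallel processing
GPU cluster
Solver
InfiniBand
Authors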
Szilárd Páll,Artem Zhmurov,Paul Bauer,Mark Abraham,Magnus Lundborg,Alan Gray,Berk Hess,Erik Lindahl
Abstract
The introduction of accelerator devices such as graphics processing units (GPUs) has had a profound impact on molecular dynamics simulations and has enabled order-of-magnitude performance advances using commodity hardware. To fully reap these benefits, it has been necessary to reformulate some of the most fundamental algorithms, including the Verlet list, pair searching, and cutoffs. Here, we present the heterogeneous parallelization and acceleration design of molecular dynamics implemented in the GROMACS codebase over the last decade. The setup involves a general cluster-based approach to pair lists and non-bonded pair interactions that utilizes both GPU and central processing unit (CPU) single instruction, multiple data acceleration efficiently, including the ability to load-balance tasks between CPUs and GPUs. The algorithm work efficiency is tuned for each type of hardware, and to use accelerators more efficiently, we introduce dual pair lists with rolling pruning updates. Combined with new direct GPU-GPU communication and GPU integration, this enables excellent performance from single GPU simulations through strong scaling across multiple GPUs and efficient multi-node parallelization.
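The dual pair list idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the GROMACS implementation: it works on individual particles rather than clusters, uses a brute-force O(N²) search without periodic boundaries, and prunes the whole list at once rather than in rolling portions. All names (`build_outer_list`, `prune`, `simulate`) and all parameter values are hypothetical, chosen only for illustration. The sketch assumes each particle moves at most `max_step` per step, so a pair distance changes by at most `2 * max_step` per step, which is what sizes the two buffers.

```python
import numpy as np

def build_outer_list(pos, r_outer):
    """Brute-force search: all pairs (i, j), i < j, closer than r_outer."""
    diff = pos[:, None, :] - pos[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)
    i, j = np.triu_indices(len(pos), k=1)
    mask = dist2[i, j] < r_outer ** 2
    return np.stack([i[mask], j[mask]], axis=1)

def prune(pos, pair_list, radius):
    """Keep only the listed pairs whose current distance is below radius."""
    d = pos[pair_list[:, 0]] - pos[pair_list[:, 1]]
    dist2 = np.einsum("ij,ij->i", d, d)
    return pair_list[dist2 < radius ** 2]

def simulate(n=200, steps=60, r_cut=1.0, max_step=0.005,
             prune_every=4, rebuild_every=20, seed=0):
    """Random-walk toy system (no forces) that checks list coverage each step."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(0.0, 5.0, size=(n, 3))
    # Inner buffer covers drift between prunes; outer covers drift between rebuilds.
    r_inner = r_cut + 2 * max_step * prune_every
    r_outer = r_inner + 2 * max_step * rebuild_every
    covered = True
    for step in range(steps):
        if step % rebuild_every == 0:      # expensive, infrequent search
            outer = build_outer_list(pos, r_outer)
        if step % prune_every == 0:        # cheap, frequent pruning of the outer list
            inner = prune(pos, outer, r_inner)
        # The inner list must still contain every pair inside the true cutoff.
        via_list = len(prune(pos, inner, r_cut))
        exact = len(build_outer_list(pos, r_cut))
        covered = covered and (via_list == exact)
        # Move each particle by a random displacement of norm <= max_step.
        disp = rng.uniform(-1.0, 1.0, size=(n, 3)) * (max_step / np.sqrt(3.0))
        pos = pos + disp
    return covered, len(outer), len(inner)

if __name__ == "__main__":
    covered, n_outer, n_inner = simulate()
    print("no interactions missed:", covered)
    print("outer list pairs:", n_outer, "inner list pairs:", n_inner)
```

The design point the sketch tries to convey is that the costly search runs rarely with a generous buffer, while the interaction loop only ever walks the shorter pruned list, so fewer out-of-range pairs are evaluated per step.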