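Physics
Polygon mesh
Hypersonic speed
Aerospace engineering
Hypersonic flow
Scale (ratio)
Computational fluid dynamics
Flow (mathematics)
Computational science
Mechanics
Computer graphics (images)
Computer science
Quantum mechanics
Engineering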
Authors
Ye Zhang,Zhengyu Tian,Hang Yu,Feng Liu,Xiao She
Abstract
The development of hypersonic vehicles presents severe challenges to computational fluid dynamics (CFD) simulation efficiency, particularly for unstructured meshes where traditional central processing unit (CPU) architectures lack scalability and graphics processing unit (GPU) implementations require further optimization. This paper constructs a multi-level parallel acceleration framework for a three-dimensional unstructured solver targeting heterogeneous architectures. Profiling reveals the intrinsic constraint of memory access on non-independent parallel kernel functions. We enhance memory efficiency across multiple dimensions, including data layout reconstruction, mesh reordering, and kernel fusion. A decoupled reordering strategy partitioning the domain into inner-halo-padding regions is designed to enable overlap of multi-GPU computation and communication while preserving data locality. Benefiting from the generality of these optimizations, the framework is easily portable to other heterogeneous platforms like deep computing units (DCUs). Tests demonstrate speedups of approximately 1600x on GPU and over 500x on DCU compared to CPU implementations, enabling efficient simulations reaching hundred-million-cell scale with excellent scalability and cross-platform capability. The proposed framework offers a reusable paradigm for optimizing high-performance unstructured CFD software, enhancing hypersonic aerodynamic assessment efficiency.
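The abstract does not give the details of the decoupled reordering, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each partition knows which of its cells depend on data from a neighboring partition (called `halo_cells` here), renumbers the remaining inner cells first in breadth-first order so that neighboring cells stay close in memory, pads the inner block to an alignment boundary, and places the halo-coupled cells after the padding. The function name `decoupled_reorder`, its parameters, and the default alignment of 32 are all hypothetical.

```python
from collections import deque


def decoupled_reorder(num_cells, faces, halo_cells, align=32):
    """Renumber cells as [inner | padding | halo] (illustrative sketch only).

    num_cells  : number of local cells, indexed 0..num_cells-1
    faces      : iterable of (cell_a, cell_b) pairs sharing a face
    halo_cells : cells that need data from another partition (assumed known)
    align      : inner block is padded up to a multiple of this size

    Returns (new_index, n_inner, halo_offset), where new_index maps an old
    cell id to its new id and halo_offset is where the halo block starts.
    """
    # Build cell-to-cell adjacency from the face list.
    adj = [[] for _ in range(num_cells)]
    for a, b in faces:
        adj[a].append(b)
        adj[b].append(a)

    def bfs_order(members):
        # Breadth-first traversal restricted to `members`; cells that are
        # neighbors in the mesh end up with nearby indices.
        members = set(members)
        seen, order = set(), []
        for start in sorted(members):
            if start in seen:
                continue
            seen.add(start)
            queue = deque([start])
            while queue:
                cell = queue.popleft()
                order.append(cell)
                for nb in sorted(adj[cell]):
                    if nb in members and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        return order

    halo = set(halo_cells)
    inner_order = bfs_order(c for c in range(num_cells) if c not in halo)
    halo_order = bfs_order(halo)

    # Pad the inner block so the halo block starts on an aligned index.
    n_inner = len(inner_order)
    halo_offset = n_inner + (-n_inner) % align

    new_index = {}
    for i, cell in enumerate(inner_order):
        new_index[cell] = i
    for i, cell in enumerate(halo_order):
        new_index[cell] = halo_offset + i
    return new_index, n_inner, halo_offset
```

With a layout of this kind, a solver could launch the kernel over the contiguous inner range while halo data is still being exchanged, and process the halo range once communication completes. Whether the paper's padding region serves alignment or another purpose is not stated in the abstract.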