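SIMD
Vectorization (mathematics)
Parallel computing
Computer science
Multiplication (music)
Encoding (set theory)
Vector processor
Loop (graph theory)
Nested loop join
Computational science
Programming language
Mathematics
Combinatorics
Set (abstract data type)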
Authors
Christopher Rodrigues, Amarin Phaosawasdi, Peng Wu
Identifier
DOI:10.1145/3178433.3178436
Abstract
Developers often rely on automatic vectorization to speed up fine-grained data-parallel code. However, for loop nests where the loops are shorter than the processor's SIMD width, automatic vectorization performs poorly. Vectorizers attempt to vectorize a single short loop, using (at best) a fraction of the processor's SIMD capacity. It is not straightforward to vectorize multiple nested loops together because they typically have memory accesses with multiple strides, which conventional methods cannot profitably vectorize.
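The two obstacles named in the abstract can be made concrete with a small model. The sketch below is illustrative only and is not taken from the paper: the 16-lane SIMD width, the 4 x 4 loop nest, the row pitch of 16 elements, and all function names are assumptions chosen for the example. It shows that vectorizing the short inner loop alone fills a quarter of the lanes, while collapsing both loops fills every lane but produces memory accesses with more than one stride.

```python
SIMD_LANES = 16  # assumption: 16 x 32-bit lanes, e.g. a 512-bit vector unit


def lane_utilization(trip_count, lanes=SIMD_LANES):
    """Fraction of lanes doing useful work when a single loop with
    `trip_count` iterations is vectorized on its own."""
    vectors = -(-trip_count // lanes)  # ceiling division
    return trip_count / (vectors * lanes)


def collapsed_offsets(rows, cols, row_pitch):
    """Element offsets touched, in iteration order, when a rows x cols
    loop nest over a row-major array is collapsed into one sequence."""
    return [i * row_pitch + j for i in range(rows) for j in range(cols)]


def distinct_strides(offsets):
    """Sorted set of differences between consecutive offsets."""
    return sorted({b - a for a, b in zip(offsets, offsets[1:])})


# Hypothetical workload: a 4 x 4 tile inside an array whose rows are 16 wide.
offsets = collapsed_offsets(rows=4, cols=4, row_pitch=16)

print(lane_utilization(4))        # inner loop alone: 0.25
print(lane_utilization(16))       # both loops collapsed: 1.0
print(distinct_strides(offsets))  # [1, 13]
```

The collapsed nest has enough iterations to fill one vector, but its accesses alternate between a unit stride inside a row and a jump of 13 elements between rows, so a single contiguous vector load does not cover them.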