合并排序
计算机科学
并行计算
合并算法
排序算法
合并(版本控制)
SIMD公司
分类
加速
快速排序
算法
情报检索
作者
Alex Watkins,Oded Green
出处
期刊:Irregular Applications: Architectures and Algorithms
日期:2018-11-01
被引量:3
标识
DOI:10.1109/ia3.2018.00012
摘要
Merging and sorting algorithms are the backbone of many modern computer applications. As such, efficient implementations are desired. Recent architectural advancements in CPUs (Central Processing Units), such as wider and more powerful vector instructions, allow for algorithmic improvements. This paper presents a new approach to merge sort using vector instructions. Traditional approaches to vectorized sorting typically utilize a bitonic sorting network (Batcher's Algorithm) which adds significant overhead. Our approach eliminates the overhead from this approach. We start with a branch-avoiding merge algorithm and then use the Merge Path algorithm to split up merging between the different SIMD lanes. Testing demonstrates that the algorithm not only surpasses the SIMD based bitonic counterpart, but that it is over 2.94x faster than a traditional merge, merging over 300M keys per second in one thread and over 16B keys per second in parallel. Our new sort reaches is over 5x faster than quicksort and 2x faster than Intel's IPP library sort, sorting over 5.3M keys per second for a single processor and in parallel over 500M keys per second and a speedup of over 2x from a traditional merge sort.
科研通智能强力驱动
Strongly Powered by AbleSci AI