排序算法
分类
计算机科学
并行计算
分类
快速排序
算法
并行算法
分拣网络
图形
拓扑排序
理论计算机科学
有向无环图
情报检索
作者
Omar Obeya,Endrias Kahssay,Edward M. Fan,Julian Shun
标识
DOI:10.1145/3323165.3323198
摘要
Parallel radix sorting has been extensively studied in the literature for decades. However, the most efficient implementations require auxiliary memory proportional to the input size, which can be prohibitive for large inputs. The standard serial in-place radix sorting algorithm is based on swapping each key to its correct place in the array on each level of recursion. However, it is not straightforward to parallelize this algorithm due to dependencies among the swaps. This paper presents Regions Sort, a new parallel in-place radix sorting algorithm that is efficient both in theory and in practice. Our algorithm uses a graph structure to model dependencies across elements that need to be swapped, and generates independent tasks from this graph that can be executed in parallel. For sorting n integers from a range r, and a parameter K, Regions Sort requires only O(Kłog rłog n) auxiliary memory. Our algorithm requires O(nłog r) work and O((n/K+łog K)łog r) span, making it work-efficient and highly parallel. In addition, we present several optimizations that significantly improve the empirical performance of our algorithm. We compare the performance of Regions Sort to existing parallel in-place and out-of-place sorting algorithms on a variety of input distributions and show that Regions Sort is faster than optimized out-of-place radix sorting and comparison sorting algorithms, and is almost always faster than the fastest publicly-available in-place sorting algorithm.
科研通智能强力驱动
Strongly Powered by AbleSci AI