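SIMD
Computer science
Parallel computing
Speedup
Algorithm
Floating point
Kernel (algebra)
Gaussian distribution
Gaussian blur
Gaussian filter
x86
Convolution (computer science)
Computational science
Image processing
Mathematics
Image (mathematics)
Artificial intelligence
Image restoration
Discrete mathematics
Physics
Software
Quantum mechanics
Artificial neural network
Programming language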
Authors
Ekaterina O. Rybakova, Elena Limonova, Dmitry Nikolaev
Abstract
Gaussian filtering, a convolution with a Gaussian kernel, is a widespread technique in image analysis and computer vision applications and the traditional approach to noise reduction. In some cases, performing the exact convolution can be computationally expensive and time-consuming. To address this problem, approximations of the convolution, such as running sums, Bell blur, and the Deriche approximation, are often used to balance accuracy and computational efficiency. At the same time, modern computing devices support data parallelism (vectorization) via Single Instruction Multiple Data (SIMD) instructions and can process integers faster than floating-point numbers. In this paper, we describe several methods for approximating a Gaussian filter, implement their SIMD and quantized versions, and compare them in terms of speed and accuracy. The experiments were performed on central processing units with an x86_64 architecture using the SSE family of SIMD extensions and with an ARMv8 architecture using the NEON SIMD extension. All the optimized approximations demonstrated a 10–20× speedup while maintaining the accuracy in the range of 1 × 10⁻⁵ or higher. The fastest method is a trivial Stack blur with a relatively high error, so we recommend the second-order Vliet–Young–Verbeek filter and the quantized Bell blur and running sums as more accurate and still computationally efficient alternatives.
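To make the running-sums idea mentioned in the abstract concrete, the following is a minimal one-dimensional sketch in Python. It is not the authors' implementation: it is neither vectorized nor quantized, and the function names, the clamped border handling, and the choice of three box passes are all assumptions made for illustration. Each box pass updates a sliding window sum in constant time per sample, and repeating the pass brings the overall response closer to a Gaussian.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalized, truncated Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_exact(signal, kernel):
    """Direct convolution with clamped borders (the reference result)."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index to the signal
            acc += w * signal[j]
        out.append(acc)
    return out


def box_blur_running_sum(signal, radius):
    """Box filter of width 2*radius+1 using a sliding (running) sum."""
    n = len(signal)
    width = 2 * radius + 1
    # Window sum centred on index 0, with clamped borders.
    acc = sum(signal[min(max(j, 0), n - 1)] for j in range(-radius, radius + 1))
    out = [acc / width]
    for i in range(1, n):
        # Slide the window: add the entering sample, drop the leaving one.
        acc += signal[min(i + radius, n - 1)] - signal[max(i - radius - 1, 0)]
        out.append(acc / width)
    return out


def approx_gaussian(signal, sigma, passes=3):
    """Approximate a Gaussian blur by repeated box blurs (assumed scheme).

    A box of radius r has variance r*(r+1)/3, so `passes` boxes give
    passes*r*(r+1)/3; r is chosen so that this is close to sigma**2.
    """
    r = max(1, round((math.sqrt(1.0 + 12.0 * sigma * sigma / passes) - 1.0) / 2.0))
    out = list(signal)
    for _ in range(passes):
        out = box_blur_running_sum(out, r)
    return out


if __name__ == "__main__":
    impulse = [0.0] * 64
    impulse[32] = 1.0
    approx = approx_gaussian(impulse, sigma=3.0)
    exact = convolve_exact(impulse, gaussian_kernel(3.0, radius=12))
    print("max abs difference:", max(abs(a - e) for a, e in zip(approx, exact)))
```

Because the box radius must be an integer, the effective standard deviation only roughly matches the requested one, which is one source of the approximation error that the paper's comparison of speed and accuracy is concerned with.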