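QR decomposition
Parallelizable manifold
Multi-core processor
Parallel computing
Computation
Computer science
CUDA
LU decomposition
Least-squares function approximation
Preconditioner
Algorithm
Matrix (chemical analysis)
Iterative method
Graphics processing unit
Set (abstract data type)
Matrix decomposition
Mathematics
Eigenvector
Physics
Statistics
Materials science
Quantum mechanics
Composite material
Estimator
Programming language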
Authors
Yuancheng Luo, Ramani Duraiswami
Abstract
We parallelize a version of the active-set iterative algorithm derived from the original works of Lawson and Hanson [Solving Least Squares Problems, Prentice-Hall, 1974] on multicore architectures. At every step of the iteration, this algorithm requires the solution of an unconstrained least squares problem for a matrix composed of the passive columns of the original system matrix. To improve performance, we use parallelizable procedures to efficiently update and downdate the $QR$ factorization of the matrix at each iteration, to account for inserted and removed columns. We use a reordering strategy of the columns in the decomposition to reduce computation and memory access costs. We consider graphics processing units (GPUs) as a new mode for efficient parallel computations and compare our GPU implementations with those on multicore CPUs. Both synthetic and nonsynthetic data are used in the experiments.
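The active-set iteration described in the abstract can be illustrated with a minimal serial sketch. It assumes the algorithm in question is the Lawson and Hanson nonnegative least squares (NNLS) method, which minimizes the norm of Ax - b subject to x >= 0. The function name `nnls_active_set` and its parameters `tol` and `max_iter` are hypothetical. For brevity, the sketch re-solves each passive-column subproblem from scratch with `numpy.linalg.lstsq`; it does not implement the QR updating and downdating, the column reordering, or the parallelization that the abstract refers to.

```python
import numpy as np

def nnls_active_set(A, b, tol=1e-10, max_iter=None):
    """Sketch of an active-set NNLS iteration: min ||Ax - b|| subject to x >= 0."""
    m, n = A.shape
    if max_iter is None:
        max_iter = 10 * n                  # assumed safety cap on outer iterations
    passive = np.zeros(n, dtype=bool)      # True = column is in the passive set
    x = np.zeros(n)
    w = A.T @ (b - A @ x)                  # negative gradient of 0.5*||Ax - b||^2
    it = 0
    while (not passive.all()) and w[~passive].max() > tol and it < max_iter:
        it += 1
        # Move the active column with the largest w entry into the passive set.
        candidates = np.where(~passive)[0]
        j = candidates[np.argmax(w[candidates])]
        passive[j] = True
        while True:
            # Unconstrained least squares restricted to the passive columns.
            s = np.zeros(n)
            if passive.any():
                s[passive] = np.linalg.lstsq(A[:, passive], b, rcond=None)[0]
            if np.all(s[passive] > 0):
                break
            # Step from x toward s until the first passive variable reaches zero,
            # then return the variables that hit zero to the active set.
            mask = passive & (s <= 0)
            alpha = np.min(x[mask] / (x[mask] - s[mask]))
            x = x + alpha * (s - x)
            passive &= x > tol
            x[~passive] = 0.0
        x = s
        w = A.T @ (b - A @ x)
    return x

if __name__ == "__main__":
    A = np.array([[1.0, 0.0], [0.0, 1.0]])
    b = np.array([1.0, -1.0])
    print(nnls_active_set(A, b))
```

Each pass through the inner loop changes the passive set by a few columns only, which is why updating an existing QR factorization, rather than recomputing the least squares solution as done here, can reduce the cost per iteration.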