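Computer science
Matrix multiplication
Parallel computing
Multiplication (music)
Memristor
Reduction (mathematics)
Multiplication algorithm
In-memory processing
Binary number
Algorithm
Arithmetic
Mathematics
Electronic engineering
Search engine
Quantum
Combinatorics
Physics
Quantum mechanics
Engineering
Web search query
Query by example
Information retrieval
Geometry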
Authors
Orian Leitersdorf, Ronny Ronen, Shahar Kvatinsky
Identifier
DOI:10.1109/iscas48785.2022.9937557
Abstract
The emerging memristive Memory Processing Unit (mMPU) overcomes the memory wall through memristive devices that unite storage and logic for real processing-in-memory (PIM) systems. At the core of the mMPU is stateful logic, which is accelerated with memristive partitions to enable logic with massive inherent parallelism within crossbar arrays. This paper vastly accelerates the fundamental operations of matrix-vector multiplication and convolution in the mMPU, with either full-precision or binary elements. The proposed algorithms establish an efficient foundation for large-scale mMPU applications such as neural networks, image processing, and numerical methods. We overcome the inherent asymmetry limitation of previous in-memory full-precision matrix-vector multiplication solutions by utilizing techniques from block matrix multiplication and reduction. We present the first fast in-memory binary matrix-vector multiplication algorithm by utilizing memristive partitions with a tree-based popcount reduction (39$\times$ faster than previous work). For convolution, we present a novel in-memory input-parallel concept, which we utilize for a full-precision algorithm that overcomes the asymmetry limitation in convolution while also improving latency (2$\times$ faster than previous work), and for the first fast binary algorithm (12$\times$ faster than previous work).
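To make the phrase "tree-based popcount reduction" concrete, here is a minimal software sketch of a binary matrix-vector product. It is not the paper's in-memory algorithm: it assumes elements are encoded as +1/-1 and combined with XNOR, as is common in binarized neural networks, which the abstract does not confirm. The names `tree_popcount` and `binary_matvec` are hypothetical. The point is only that summing agreement bits pairwise takes about log2(n) levels, and the additions within one level are independent of each other, which is the kind of parallelism that partitions could exploit.

```python
def tree_popcount(bits):
    """Count the ones in a list of 0/1 values by pairwise (tree) reduction."""
    level = list(bits)
    if not level:
        return 0
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad so every element has a partner
        # All additions within one level are independent of each other.
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    return level[0]


def binary_matvec(matrix, vector):
    """Multiply a +1/-1 matrix by a +1/-1 vector (assumed encoding)."""
    n = len(vector)
    result = []
    for row in matrix:
        # XNOR: 1 where the two signs agree, 0 where they differ.
        agree = [1 if a == x else 0 for a, x in zip(row, vector)]
        # dot = (#agreements) - (#disagreements) = 2 * popcount - n
        result.append(2 * tree_popcount(agree) - n)
    return result


matrix = [
    [1, -1, 1, 1],
    [-1, 1, -1, -1],
    [1, 1, 1, 1],
]
vector = [1, -1, 1, 1]
print(binary_matvec(matrix, vector))
```

Each output entry equals the ordinary dot product of the row with the vector, so the result can be checked against a plain sum of element-wise products.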