Design-Technology Co-Optimization for NVM-Based Neuromorphic Processing Elements

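Neuromorphic engineering, Computer science, Efficient energy use, Granularity, Latency (audio), Computer architecture, Embedded system, Non-volatile memory, Leverage (statistics), Computer hardware, Artificial neural network, Artificial intelligence, Engineering, Operating system, Electrical engineering, Telecommunications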

Authors

Shihao Song, Adarsha Balaji, Anup Das, Nagarajan Kandasamy

Source

Journal: ACM Transactions on Embedded Computing Systems [Association for Computing Machinery]
Date: 2022-03-21 Volume (issue): 21 (6): 1-27 Citations: 16

Links

osti.gov, doi.org

Identifiers

DOI: 10.1145/3524068

Abstract

An emerging use case of machine learning (ML) is to train a model on a high-performance system and deploy the trained model on energy-constrained embedded systems. Neuromorphic hardware platforms, which operate on principles of the biological brain, can significantly lower the energy overhead of an ML inference task, making these platforms an attractive solution for embedded ML systems. We present a design-technology tradeoff analysis to implement such inference tasks on the processing elements (PEs) of a non-volatile memory (NVM)-based neuromorphic hardware. Through detailed circuit-level simulations at scaled process technology nodes, we show the negative impact of technology scaling on the information-processing latency, which impacts the quality of service of an embedded ML system. At a finer granularity, the latency inside a PE depends on (1) the delay introduced by parasitic components on its current paths, and (2) the varying delay to sense different resistance states of its NVM cells. Based on these two observations, we make the following three contributions. First, on the technology front, we propose an optimization scheme where the NVM resistance state that takes the longest time to sense is set on current paths having the least delay, and vice versa, reducing the average PE latency, which improves the quality of service. Second, on the architecture front, we introduce isolation transistors within each PE to partition it into regions that can be individually power-gated, reducing both latency and energy. Finally, on the system-software front, we propose a mechanism to leverage the proposed technological and architectural enhancements when implementing an ML inference task on neuromorphic PEs of the hardware. Evaluations with a recent neuromorphic hardware architecture show that our proposed design-technology co-optimization approach improves both performance and energy efficiency of ML inference tasks without incurring high cost-per-bit.
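The first contribution in the abstract (placing the slowest-to-sense resistance state on the fastest current path, and vice versa) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a simplified additive model in which a cell's latency is its path delay plus its sense delay. The function names and the delay values (in arbitrary units) are hypothetical. Under this simple model the sum of all cell latencies does not change with the assignment, so the sketch shows only the reduction in worst-case cell latency; the average-latency improvement reported in the abstract depends on circuit details that are not modeled here.

```python
def pair_slow_states_with_fast_paths(path_delays, sense_delays):
    """Map each cell index to a path index.

    Cells are ordered from slowest to fastest sense delay, paths from
    fastest to slowest parasitic delay, and the two orders are zipped.
    Assumption: one cell per path and an additive delay model.
    """
    if len(path_delays) != len(sense_delays):
        raise ValueError("need exactly one path per cell")
    paths_fast_first = sorted(range(len(path_delays)),
                              key=lambda p: path_delays[p])
    cells_slow_first = sorted(range(len(sense_delays)),
                              key=lambda c: sense_delays[c],
                              reverse=True)
    return dict(zip(cells_slow_first, paths_fast_first))


def worst_case_latency(assignment, path_delays, sense_delays):
    """Largest (sense delay + path delay) over all assigned cells."""
    return max(sense_delays[c] + path_delays[p]
               for c, p in assignment.items())


# Hypothetical delays in arbitrary units.
path_delays = [1.0, 2.0, 3.0, 4.0]    # parasitic delay of each current path
sense_delays = [4.0, 1.0, 3.0, 2.0]   # sense delay of each cell's state

naive = {i: i for i in range(len(path_delays))}   # cell i stays on path i
optimized = pair_slow_states_with_fast_paths(path_delays, sense_delays)

print(worst_case_latency(naive, path_delays, sense_delays))
print(worst_case_latency(optimized, path_delays, sense_delays))
```

With these example values, the naive placement has a worst-case latency of 6.0 and the paired placement has 5.0. Pairing one list in ascending order with the other in descending order is a standard way to minimize the maximum pairwise sum.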
