A 64-core mixed-signal in-memory compute chip based on phase-change memory for deep neural network inference

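Computer science, Chip, Semiconductor memory, Convolutional neural network, Artificial neural network, Inference, CMOS chip, Throughput, Embedded system, Computer hardware, Parallel computing, Artificial intelligence, Electronic engineering, Engineering, Telecommunications, Wireless
Authors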
Manuel Le Gallo, Riduan Khaddam-Aljameh, Miloš Stanisavljević, Athanasios Vasilopoulos, Benedikt Kersting, Martino Dazzi, Geethan Karunaratne, Matthias Bräendli, Abhairaj Singh, Silvia Melitta Mueller, Julian Buechel, Xavier Timoneda, Vinay Joshi, Urs Egger, Angelo Garofalo, Αναστάσιος Πετρόπουλος, Theodore Antonakopoulos, Kevin Brew, Samuel Choi, Injo Ok
Source
Journal: Cornell University - arXiv. Cited by: 18
Identifier
DOI:10.48550/arxiv.2212.02872
Abstract

The need to repeatedly shuttle around synaptic weight values from memory to processing units has been a key source of energy inefficiency associated with hardware implementation of artificial neural networks. Analog in-memory computing (AIMC) with spatially instantiated synaptic weights holds high promise to overcome this challenge, by performing matrix-vector multiplications (MVMs) directly within the network weights stored on a chip to execute an inference workload. However, to achieve end-to-end improvements in latency and energy consumption, AIMC must be combined with on-chip digital operations and communication to move towards configurations in which a full inference workload is realized entirely on-chip. Moreover, it is highly desirable to achieve high MVM and inference accuracy without application-wise re-tuning of the chip. Here, we present a multi-core AIMC chip designed and fabricated in 14-nm complementary metal-oxide-semiconductor (CMOS) technology with backend-integrated phase-change memory (PCM). The fully-integrated chip features 64 256x256 AIMC cores interconnected via an on-chip communication network. It also implements the digital activation functions and processing involved in ResNet convolutional neural networks and long short-term memory (LSTM) networks. We demonstrate near software-equivalent inference accuracy with ResNet and LSTM networks while implementing all the computations associated with the weight layers and the activation functions on-chip. The chip can achieve a maximal throughput of 63.1 TOPS at an energy efficiency of 9.76 TOPS/W for 8-bit input/output matrix-vector multiplications.
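To make the tiled 8-bit matrix-vector multiplication in the abstract more concrete, here is a minimal NumPy sketch. Only the figures of 64 cores, 256x256 weights per core, and 8-bit inputs and outputs come from the abstract. The names `quantize_int8` and `tiled_mvm`, the additive Gaussian noise model, the per-tile dynamic scaling, and the digital accumulation of partial sums are illustrative assumptions, not a description of how the chip actually works.

```python
import numpy as np

CORE_SIZE = 256  # weights per core: 256x256 (from the abstract)
NUM_CORES = 64   # cores on the chip (from the abstract)


def _scale(v):
    """Pick a scale so the largest magnitude in v maps to 127 (assumed scheme)."""
    m = float(np.max(np.abs(v)))
    return m / 127 if m > 0 else 1.0


def quantize_int8(x, scale):
    """Round x/scale to the nearest integer and clip to the signed 8-bit range."""
    return np.clip(np.rint(x / scale), -128, 127).astype(np.int32)


def tiled_mvm(weights, x, noise_std=0.0, rng=None):
    """Approximate weights @ x using CORE_SIZE x CORE_SIZE tiles.

    Each tile stands in for one core. The input is quantized to 8 bits,
    each tile's partial result gets additive Gaussian noise (a stand-in for
    analog non-idealities, assumed here) and is quantized to 8 bits, and the
    partial sums are then accumulated in floating point.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    rows, cols = weights.shape
    n_tiles = (-(-rows // CORE_SIZE)) * (-(-cols // CORE_SIZE))
    if n_tiles > NUM_CORES:
        raise ValueError(f"needs {n_tiles} tiles, but only {NUM_CORES} cores exist")

    in_scale = _scale(x)
    x_q = quantize_int8(x, in_scale) * in_scale  # 8-bit input, de-quantized

    y = np.zeros(rows)
    for r in range(0, rows, CORE_SIZE):
        for c in range(0, cols, CORE_SIZE):
            tile = weights[r:r + CORE_SIZE, c:c + CORE_SIZE]
            partial = tile @ x_q[c:c + CORE_SIZE]
            partial = partial + noise_std * rng.standard_normal(partial.shape)
            out_scale = _scale(partial)
            y[r:r + CORE_SIZE] += quantize_int8(partial, out_scale) * out_scale
    return y


# Usage: a 512x512 layer occupies four tiles; compare against the exact product.
rng = np.random.default_rng(42)
W = rng.standard_normal((512, 512)) / np.sqrt(512)
x = rng.standard_normal(512)
y_ref = W @ x
y_aimc = tiled_mvm(W, x, noise_std=0.01, rng=rng)
rel_err = np.linalg.norm(y_aimc - y_ref) / np.linalg.norm(y_ref)
print(f"relative error vs. exact MVM: {rel_err:.4f}")
```

The sketch only illustrates how a layer larger than one core can be split across tiles and how 8-bit quantization at the input and output bounds the achievable precision. It says nothing about the chip's actual latency, energy, or accuracy.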