Transformer
Computer science
Homogeneous
Electrical engineering
Engineering
Voltage
Physics
Thermodynamics
Authors
Sangyeob Kim,Sangjin Kim,Wooyoung Jo,Soyeon Kim,Seongyon Hong,Nayeong Lee,Jungwan Lee,Hoi‐Jun Yoo
Source
Journal: IEEE Journal of Solid-State Circuits
[Institute of Electrical and Electronics Engineers]
Date: 2025-04-15
Volume (issue): 60 (10): 3802-3815
Citations: 4
Identifier
DOI:10.1109/jssc.2025.3554699
Abstract
In this article, we propose a new language model processor named the C-Transformer to address the external memory bottleneck of language models. It consists of three key functional blocks: 1) homogeneous deep-neural-network (DNN)–Transformer/spiking-Transformer core (HDSC) with a hybrid multiplication/accumulation unit (HMAU) to enhance hardware utilization; 2) output spike speculation unit (OSSU) to increase the energy efficiency of spike-domain processing; and 3) implicit weight generation unit (IWGU) with extended sign compression (ESC) to eliminate the external memory bottleneck. The chip is fabricated in Samsung’s 28-nm 1P8M CMOS technology and operates at a supply voltage of 0.7–1.1 V with a maximum frequency of 200 MHz and supports various tasks such as language modeling, translation, and summarization. For models like generative pre-trained transformer 2 (GPT-2), multilingual text-to-text transfer transformer (mT5), text-to-text transfer transformer (T5), and FairSeq MachineTranslation (FSMT), the C-Transformer achieves 0.21×–0.33× computation energy and 0.37×–0.41× external memory access (EMA) energy compared to the baseline. Our chip demonstrates 13.6% lower energy consumption than the previous state-of-the-art, despite having 2.1× more parameters. Moreover, it consumes 63.8% less energy with a similar parameter size. The C-Transformer can complete various language model tasks with <1 s latency, notably FSMT in 0.09 s and GPT-2 in 0.656 s. By combining DNN–Transformer and Spiking-Transformer architectures, the C-Transformer enhances computational energy efficiency, eliminates external memory bottlenecks, and enables language models such as GPT-2 to achieve state-of-the-art performance on mobile devices.